Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0001] The present invention relates to novel polynucleotides derived from microorganisms belonging to coryneform
bacteria and fragments thereof, polypeptides encoded by the polynucleotides and fragments thereof, polynucleotide
arrays comprising the polynucleotides and fragments thereof, computer readable recording media in which the
nucleotide sequences of the polynucleotides and fragments thereof have been recorded, and use of them, as well as a
method of using the polynucleotide and/or polypeptide sequence information to make comparisons.

2. Brief Description of the Background Art 

[0002] Coryneform bacteria are used in producing various useful substances, such as amino acids, nucleic acids,
vitamins, saccharides (for example, ribulose), organic acids (for example, pyruvic acid), and analogues of the
above-described substances (for example, N-acetylamino acids), and are very useful microorganisms industrially.
Many mutants thereof are known.

[0003] For example, Corynebacterium glutamicum is a Gram-positive bacterium identified as a glutamic
acid-producing bacterium, and many amino acids are produced by mutants thereof. For example, 1,000,000 tons/year of
L-glutamic acid, which is useful as a seasoning for umami (delicious taste), 250,000 tons/year of L-lysine, which is
a valuable additive for livestock feeds and the like, and several hundred tons/year or more of other amino acids,
such as L-arginine, L-proline, L-glutamine, L-tryptophan, and the like, have been produced in the world (Nikkei Bio
Yearbook 99, published by Nikkei BP (1998)).

[0004] The production of amino acids by Corynebacterium glutamicum is mainly carried out by its mutants (metabolic
mutants), which have mutated metabolic pathways and regulatory systems. In general, an organism is provided with
various metabolic regulatory systems so as not to produce more amino acids than it needs. In the biosynthesis of
L-lysine, for example, a microorganism belonging to the genus Corynebacterium prevents excessive production through
concerted inhibition, by lysine and threonine, of the activity of a biosynthesis enzyme common to lysine, threonine
and methionine, i.e., an aspartokinase (J. Biochem., 65: 849-859 (1969)). The biosynthesis of arginine is controlled
by repression of the expression of its biosynthesis genes by arginine, so that an excessive amount of arginine is not
biosynthesized (Microbiology, 142: 99-108 (1996)). It is considered that these metabolic regulatory mechanisms are
deregulated in amino acid-producing mutants. Similarly, the metabolic regulation is deregulated in mutants producing
nucleic acids, vitamins, saccharides, organic acids and analogues of the above-described substances so as to improve
the productivity of the objective product.

[0005] However, accumulation of basic genetic, biochemical and molecular biological data on coryneform bacteria
is insufficient in comparison with Escherichia coli, Bacillus subtilis, and the like. Also, few findings have been
obtained on mutated genes in amino acid-producing mutants. Thus, there are various mechanisms, which are still
unknown, of regulating the growth and metabolism of these microorganisms.

[0006] A chromosomal physical map of Corynebacterium glutamicum ATCC 13032 has been reported, and it is known that
its genome size is about 3,100 kb (Mol. Gen. Genet., 252: 255-265 (1996)). Calculating on the basis of the usual gene
density of bacteria, it is presumed that about 3,000 genes are present in this genome of about 3,100 kb. However, only
about 100 genes, mainly concerning amino acid biosynthesis, are known in Corynebacterium glutamicum, and
the nucleotide sequences of most genes have not been clarified hitherto.

[0007] In recent years, the full nucleotide sequences of the genomes of several microorganisms, such as Escherichia
coli, Mycobacterium tuberculosis, yeast, and the like, have been determined (Science, 277: 1453-62 (1997); Nature,
393: 537-544 (1998); Nature, 387: 5-106 (1997)). Based on the thus determined full nucleotide sequences, assumption
of gene regions and prediction of their function by comparison with the nucleotide sequences of known genes have
been carried out. Thus, the functions of a great number of genes have been presumed, without genetic, biochemical
or molecular biological experiments.

[0008] In recent years, moreover, techniques for monitoring expression levels of a great number of genes
simultaneously or detecting mutations, using DNA chips, DNA arrays or the like in which a partial nucleic acid fragment
of a gene or a partial nucleic acid fragment in genomic DNA other than a gene is fixed to a solid support, have been
developed. The techniques contribute to the analysis of microorganisms, such as yeasts, Mycobacterium tuberculosis,
Mycobacterium bovis used in BCG vaccines, and the like (Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA,
96: 12833-38 (1999); Science, 264: 1520-23 (1999)).
SUMMARY OF THE INVENTION

[0009] An object of the present invention is to provide a polynucleotide and a polypeptide derived from a
microorganism of coryneform bacteria which are industrially useful, sequence information of the polynucleotide and the
polypeptide, a method for analyzing the microorganism, an apparatus and a system for use in the analysis, and a
method for breeding the microorganism.

[0010] The present invention provides a polynucleotide and an oligonucleotide derived from a microorganism
belonging to coryneform bacteria, oligonucleotide arrays to which the polynucleotides and the oligonucleotides are fixed,
a polypeptide encoded by the polynucleotide, an antibody which recognizes the polypeptide, polypeptide arrays to
which the polypeptides or the antibodies are fixed, a computer readable recording medium in which the nucleotide
sequences of the polynucleotide and the oligonucleotide and the amino acid sequence of the polypeptide have been
recorded, and a system based on the computer using the recording medium, as well as a method of using the
polynucleotide and/or polypeptide sequence information to make comparisons.

BRIEF DESCRIPTION OF THE DRAWING

[0011] Fig. 1 is a map showing the positions of typical genes on the genome of Corynebacterium glutamicum ATCC
13032.

[0012] Fig. 2 is an electrophoresis image showing the results of proteome analyses using proteins derived from
(A) Corynebacterium glutamicum ATCC 13032, (B) FERM BP-7134, and (C) FERM BP-158.

[0013] Fig. 3 is a flow chart of an example of a system using the computer readable media according to the present
invention.

[0014] Fig. 4 is a flow chart of an example of a system using the computer readable media according to the present
invention.

DETAILED DESCRIPTION OF THE INVENTION 

[0015] This application is based on Japanese applications No. Hei. 11-377484 filed on December 16, 1999, No.
2000-159162 filed on April 7, 2000 and No. 2000-280988 filed on August 3, 2000, the entire contents of which are
incorporated herein by reference.

[0016] From the viewpoint that the determination of the full nucleotide sequence of Corynebacterium glutamicum
would make it possible to specify gene regions which had not been previously identified, to determine the function of
an unknown gene derived from the microorganism through comparison with nucleotide sequences of known genes
and amino acid sequences of known genes, and to obtain a useful mutant based on the presumption of the metabolic
regulatory mechanism of a useful product by the microorganism, the inventors conducted intensive studies and, as a
result, found that the complete genome sequence of Corynebacterium glutamicum can be determined by applying the
whole genome shotgun method.

[0017] Specifically, the present invention relates to the following (1) to (66):

(1) A method for at least one of the following:

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium,

(B) measuring an expression amount of a gene derived from a coryneform bacterium,

(C) analyzing an expression profile of a gene derived from a coryneform bacterium,

(D) analyzing expression patterns of genes derived from a coryneform bacterium, or

(E) identifying a gene homologous to a gene derived from a coryneform bacterium,

said method comprising:

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected
from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any
one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under
stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of
the first or second polynucleotides,

(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a
coryneform bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a
labeled polynucleotide to be examined, under hybridization conditions,

(c) detecting any hybridization, and

(d) analyzing the result of the hybridization.

As used herein, for example, the at least two polynucleotides can be at least two of the first
polynucleotides, at least two of the second polynucleotides, at least two of the third polynucleotides, or at least
two of the first, second and third polynucleotides.

(2) The method according to (1), wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(3) The method according to (2), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(4) The method according to (1), wherein the polynucleotide derived from a coryneform bacterium, the
polynucleotide derived from a mutant of the coryneform bacterium or the polynucleotide to be examined is a gene relating
to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid, and analogues thereof.

(5) The method according to (1), wherein the polynucleotide to be examined is derived from Escherichia coli.

(6) A polynucleotide array, comprising:

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the
nucleotide sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize
with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200
continuous bases of the first or second polynucleotides, and

a solid support adhered thereto.

As used herein, for example, the at least two polynucleotides can be at least two of the first polynucleotides,
at least two of the second polynucleotides, at least two of the third polynucleotides, or at least two of the first,
second and third polynucleotides.

(7) A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having
a homology of at least 80% with the polynucleotide.

(8) A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or
a polynucleotide which hybridizes with the polynucleotide under stringent conditions.

(9) A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID
NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions.

(10) A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the
nucleotide sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide
sequence represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide.

(11) A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of
any one of (7) to (10), or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide
comprising 10 to 200 continuous bases.
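
The fragment and complement described in (11) can be illustrated with a minimal Python sketch. The function names `fragments` and `reverse_complement` are hypothetical and introduced only for illustration, and the sketch assumes that "complementary" is read as the reverse complement of a DNA strand written 5' to 3'; it is not taken from the specification.

```python
# Illustrative sketch only; names below are assumptions, not part of the specification.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def fragments(seq: str, length: int) -> list:
    """List every run of `length` continuous bases in `seq`.

    The length is restricted to the 10 to 200 base range named in (11).
    """
    if not 10 <= length <= 200:
        raise ValueError("length must be between 10 and 200 bases")
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    toy = "ACGTACGTACGT"  # toy sequence, not a sequence from the sequence listing
    for frag in fragments(toy, 10):
        print(frag, reverse_complement(frag))
```

Each fragment and its complement would correspond to a candidate probe of the kind fixed to the solid support in (6).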

(12) A recombinant DNA comprising the polynucleotide of any one of (8) to (11).

(13) A transformant comprising the polynucleotide of any one of (8) to (11) or the recombinant DNA of (12).

(14) A method for producing a polypeptide, comprising:

culturing the transformant of (13) in a medium to produce and accumulate a polypeptide encoded by the
polynucleotide of (8) or (9) in the medium, and

recovering the polypeptide from the medium.

(15) A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof, comprising:

culturing the transformant of (13) in a medium to produce and accumulate at least one of an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and

recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid,
and analogues thereof from the medium.

(16) A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:2
to 3431.

(17) A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931.

(18) The polypeptide according to (16) or (17), wherein at least one amino acid is deleted, replaced, inserted or
added, said polypeptide having an activity which is substantially the same as that of the polypeptide without said
at least one amino acid deletion, replacement, insertion or addition.

(19) A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid
sequence of the polypeptide of (16) or (17), and having an activity which is substantially the same as that of the
polypeptide.
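
How a threshold such as "a homology of at least 60%" might be checked can be sketched as follows. This is a simplification under stated assumptions: the two sequences are taken to be already aligned, of equal length and without gaps, and homology is approximated as percent identity. The function name `percent_identity` is hypothetical, and a practical comparison would normally rely on an alignment program rather than this position-by-position count.

```python
# Illustrative sketch only; assumes pre-aligned, gap-free sequences of equal length.
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which the two sequences agree."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 60.0) -> bool:
    """True when the identity reaches the threshold (60% is the figure in (19))."""
    return percent_identity(a, b) >= threshold


if __name__ == "__main__":
    # Toy peptides, not sequences from the sequence listing.
    print(percent_identity("MKVLA", "MKILA"))
    print(meets_threshold("MKVLA", "MKILA"))
```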

(20) An antibody which recognizes the polypeptide of any one of (16) to (19). 

(21) A polypeptide array, comprising:

at least one polypeptide or partial fragment polypeptide selected from the polypeptides of (16) to (19) and
partial fragment polypeptides of the polypeptides, and

a solid support adhered thereto.

(22) A polypeptide array, comprising:

at least one antibody which recognizes a polypeptide or partial fragment polypeptide selected from the
polypeptides of (16) to (19) and partial fragment polypeptides of the polypeptides, and

a solid support adhered thereto.

(23) A system based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501 with the target sequence or target structure motif information, recorded by the data storage device,
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

(24) A method based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target
sequence information or target structure motif information into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with
the target sequence or target structure motif information; and

(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information.
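
The input, storage, comparison and screening steps of (23) and (24) can be sketched in Python as below. All names (`Hit`, `screen`, `max_mismatches`) are hypothetical, the stored records are toy strings rather than sequences from the sequence listing, and "coincident with or analogous to" is interpreted here, as an assumption, as an ungapped match with at most a given number of mismatches.

```python
# Illustrative sketch only; names and the mismatch criterion are assumptions.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hit:
    seq_id: str      # identifier of the stored sequence
    position: int    # 0-based start of the matching window
    mismatches: int  # 0 means coincident; a larger value means analogous


def screen(records: Dict[str, str], target: str, max_mismatches: int = 0) -> List[Hit]:
    """Compare a target against every window of every stored sequence."""
    hits = []
    n = len(target)
    for seq_id, seq in records.items():
        for i in range(len(seq) - n + 1):
            mm = sum(a != b for a, b in zip(seq[i:i + n], target))
            if mm <= max_mismatches:
                hits.append(Hit(seq_id, i, mm))
    return hits


if __name__ == "__main__":
    stored = {"s1": "AAACGTAA", "s2": "TTTCGAAA"}  # stands in for the data storage device
    for hit in screen(stored, "CGT", max_mismatches=1):
        print(hit)  # stands in for the output device
```

The dictionary plays the role of the data storage device and the printed hits play the role of the output device; a structure motif search would need a different comparator, for example a pattern matcher.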

(25) A system based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:3502
to 7001, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target sequence or target structure motif information, recorded by the data storage
device, for screening and analyzing amino acid sequence information which is coincident with or analogous to
the target sequence or target structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

(26) A method based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, and target
sequence information or target structure motif information into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target sequence or target structure motif information; and

(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the
target sequence or target structure motif information.

(27) A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having
a target nucleotide sequence derived from a coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide
sequence information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:
2 to 3501 with the target nucleotide sequence information, and determines a function of a polypeptide encoded
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the
polynucleotide having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and

(iv) an output device that shows a function obtained by the comparator.

(28) A method based on a computer for determining a function of a polypeptide encoded by a polynucleotide
having a target nucleotide sequence derived from a coryneform bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function
information of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with
the target nucleotide sequence information; and

(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected
from SEQ ID NOS:2 to 3501.

(29) A system based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence
information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and

(iv) an output device that shows a function obtained by the comparator.

(30) A method based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function
information based on the amino acid sequence, and target amino acid sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target amino acid sequence information; and

(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to
7001.

(31) The system according to any one of (23), (25), (27) and (29), wherein a coryneform bacterium is a
microorganism of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(32) The method according to any one of (24), (26), (28) and (30), wherein a coryneform bacterium is a
microorganism of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(33) The system according to (31), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(34) The method according to (32), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(35) A recording medium or storage device which is readable by a computer, in which at least one nucleotide
sequence information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide
sequence is recorded, and is usable in the system of (23) or (27) or the method of (24) or (28).

(36) A recording medium or storage device which is readable by a computer, in which at least one amino acid
sequence information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid
sequence is recorded, and is usable in the system of (25) or (29) or the method of (26) or (30).

(37) The recording medium or storage device according to (35) or (36), which is a computer readable recording
medium selected from the group consisting of a floppy disc, a hard disc, a magnetic tape, a random access memory
(RAM), a read only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM and DVD-RW.

(38) A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the
Val residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform
bacterium is replaced with an amino acid residue other than a Val residue.

(39) A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino
acid sequence represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue.

(40) The polypeptide according to (38) or (39), wherein the Val residue at the 59th position is replaced with an Ala 
residue. 
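
The single-residue replacement described in (38) to (40) can be illustrated with a short Python sketch. The function name `substitute_residue` is hypothetical, and the example uses a toy peptide with a Val at position 4 rather than the sequence of SEQ ID NO:6952, which is not reproduced here; positions are 1-based, as in the text.

```python
# Illustrative sketch only; the toy peptide is not a sequence from the sequence listing.
def substitute_residue(seq: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position after checking what is there."""
    index = position - 1
    if not 0 <= index < len(seq):
        raise ValueError("position is outside the sequence")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


if __name__ == "__main__":
    toy = "MKTVLA"  # one-letter codes; V = Val, A = Ala
    print(substitute_residue(toy, 4, "V", "A"))
```

Checking the expected residue first guards against applying a substitution to a sequence numbered differently from the reference.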

(41) A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro
residue at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform
bacterium is replaced with an amino acid residue other than a Pro residue.

(42) A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino
acid sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue.

(43) The polypeptide according to (41) or (42), wherein the Pro residue at the 458th position is replaced with a Ser
residue.

(44) The polypeptide according to any one of (38) to (43), which is derived from Corynebacterium glutamicum.

(45) A DNA encoding the polypeptide of any one of (38) to (44). 

(46) A recombinant DNA comprising the DNA of (45). 

(47) A transformant comprising the recombinant DNA of (46). 

(48) A transformant comprising in its chromosome the DNA of (45).

(49) The transformant according to (47) or (48), which is derived from a coryneform bacterium.

(50) The transformant according to (49), which is derived from Corynebacterium glutamicum.

(51) A method for producing L-lysine, comprising:

culturing the transformant of any one of (47) to (50) in a medium to produce and accumulate L-lysine in the
medium, and

recovering the L-lysine from the culture.

(52) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:1 to 3431, comprising the following:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point; and

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).
(53) The method according to (52), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

(54) The method according to (52), wherein the mutation point is a mutation point relating to a useful mutation
which improves or stabilizes the productivity.
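
The comparison and mutation-point identification steps of (52) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the reference and production-strain sequences are taken to be the same length and to differ only by base substitutions (insertions and deletions would require an alignment step first), the function name `mutation_points` is hypothetical, and the sequences are toy strings rather than entries of the sequence listing.

```python
# Illustrative sketch only; handles substitutions in equal-length sequences, nothing more.
from typing import List, Tuple


def mutation_points(reference: str, production: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, production-strain base) for each difference."""
    if len(reference) != len(production):
        raise ValueError("sequences must be the same length; align them first")
    return [
        (i + 1, r, p)
        for i, (r, p) in enumerate(zip(reference, production))
        if r != p
    ]


if __name__ == "__main__":
    wild_type = "ATGGTTCCA"   # toy stand-in for a reference gene
    producer = "ATGGCTCCA"    # toy stand-in for the production strain
    for pos, ref_base, mut_base in mutation_points(wild_type, producer):
        print(f"position {pos}: {ref_base} -> {mut_base}")
```

Each reported position is a candidate to be introduced into a strain free of the mutation and then evaluated for productivity, as in steps (iii) and (iv).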

(55) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:1 to 3431, comprising:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).

(56) The method according to (55), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

(57) The method according to (55), wherein the mutation point is a mutation point which decreases or destabilizes
the productivity.

(58) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:2 to 3431, comprising the following:

(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide
sequence information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an isozyme having the same activity;

(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and

(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform
bacterium which has been transformed with the gene obtained in (iii).

(59) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:2 to 3431, comprising the following:

(i) arranging function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431;

(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a
coryneform bacterium;

(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and

(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv).

(60) A coryneform bacterium, bred by the method of any one of (52) to (59).

(61) The coryneform bacterium according to (60), which is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(62) The coryneform bacterium according to (61), wherein the microorganism belonging to the genus
Corynebacterium is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(63) A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a
saccharide, an organic acid and an analogue thereof, comprising:

culturing a coryneform bacterium of any one of (60) to (62) in a medium to produce and accumulate at least
one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof; and

recovering the compound from the culture.

(64) The method according to (63), wherein the compound is L-lysine.

(65) A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the following:

(i) preparing

a protein derived from a bacterium of a production strain of a coryneform bacterium which has been subjected to mutation breeding by a fermentation process so as to produce at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, and
a protein derived from a bacterium of a parent strain of the production strain;

(ii) separating the proteins prepared in (i) by two dimensional electrophoresis;

(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the production strain with that derived from the parent strain;

(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase to extract peptide fragments;

(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and

(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequence represented by SEQ ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.

As used herein, the term "proteome", which is a word coined by combining "protein" with "genome", refers to a method for examining a gene at the polypeptide level.

(66) The method according to (65), wherein the coryneform bacterium is a microorganism belonging to the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(67) The method according to (66), wherein the microorganism belonging to the genus Corynebacterium is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(68) A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).

[0018] The present invention will be described below in more detail, based on the determination of the full nucleotide sequence of coryneform bacteria.

1. Determination of full nucleotide sequence of coryneform bacteria

[0019] The term "coryneform bacteria" as used herein means a microorganism belonging to the genus Corynebacterium, the genus Brevibacterium or the genus Microbacterium as defined in Bergey's Manual of Determinative Bacteriology, 8: 599 (1974).

[0020] Examples include Corynebacterium acetoacidophilum, Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium glutamicum, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium melassecola, Corynebacterium thermoaminogenes, Brevibacterium saccharolyticum, Brevibacterium immariophilum, Brevibacterium roseum, Brevibacterium thiogenitalis, Microbacterium ammoniaphilum, and the like.

[0021] Specific examples include Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium acetoglutamicum ATCC 15806, Corynebacterium callunae ATCC 15991, Corynebacterium glutamicum ATCC 13032, Corynebacterium glutamicum ATCC 13060, Corynebacterium glutamicum ATCC 13826 (prior genus and species: Brevibacterium flavum, or Corynebacterium lactofermentum), Corynebacterium glutamicum ATCC 14020 (prior genus and species: Brevibacterium divaricatum), Corynebacterium glutamicum ATCC 13869 (prior genus and species: Brevibacterium lactofermentum), Corynebacterium herculis ATCC 13868, Corynebacterium lilium ATCC 15990, Corynebacterium melassecola ATCC 17965, Corynebacterium thermoaminogenes FERM 9244, Brevibacterium saccharolyticum ATCC 14066, Brevibacterium immariophilum ATCC 14068, Brevibacterium roseum ATCC 13825, Brevibacterium thiogenitalis ATCC 19240, Microbacterium ammoniaphilum ATCC 15354, and the like.

(1) Preparation of genome DNA of coryneform bacteria

[0022] Coryneform bacteria can be cultured by a conventional method.

[0023] Either a natural medium or a synthetic medium can be used, so long as it is suitable for efficient culturing of the microorganism and contains a carbon source, a nitrogen source, an inorganic salt, and the like which can be assimilated by the microorganism.

[0024] For Corynebacterium glutamicum, for example, a BY medium (7 g/l meat extract, 10 g/l peptone, 3 g/l sodium chloride, 5 g/l yeast extract, pH 7.2) containing 1% of glycine and the like can be used. The culturing is carried out at 25 to 35°C overnight.

[0025] After the completion of the culture, the cells are recovered from the culture by centrifugation. The resulting cells are washed with a washing solution.

[0026] Examples of the washing solution include STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l ethylenediaminetetraacetic acid (hereinafter referred to as "EDTA"), pH 8.0), and the like.

[0027] Genome DNA can be obtained from the washed cells according to a conventional method for obtaining genome DNA, namely, lysing the cell wall of the cells using a lysozyme and a surfactant (SDS, etc.), eliminating proteins and the like using a phenol solution and a phenol/chloroform solution, and then precipitating the genome DNA with ethanol or the like. Specifically, the following method can be illustrated.

[0028] The washed cells are suspended in a washing solution containing 5 to 20 mg/l lysozyme. After shaking, 5 to 20% SDS is added to lyse the cells. Usually, shaking is gently performed at 25 to 40°C for 30 minutes to 2 hours. After shaking, the suspension is maintained at 60 to 70°C for 5 to 15 minutes for the lysis.

[0029] After the lysis, the suspension is cooled to ordinary temperature, and 5 to 20 ml of Tris-neutralized phenol is added thereto, followed by gentle shaking at room temperature for 15 to 45 minutes.

[0030] After shaking, centrifugation (15,000 x g, 20 minutes, 20°C) is carried out to fractionate the aqueous layer.

[0031] After performing extraction with phenol/chloroform and extraction with chloroform (twice) in the same manner, 3 mol/l sodium acetate solution (pH 5.2) and isopropanol are added to the aqueous layer at 1/10 times volume and 2 times volume of the aqueous layer, respectively, followed by gentle stirring to precipitate the genome DNA.
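
As a worked illustration of the volume ratios given in [0031], the following Python sketch computes the amounts to add for a given aqueous-layer volume. The function name and the 5 ml example are illustrative assumptions and are not part of the described method.

```python
def precipitation_volumes(aqueous_ml):
    """Return (sodium acetate ml, isopropanol ml) for an aqueous layer of the given volume."""
    if aqueous_ml <= 0:
        raise ValueError("aqueous-layer volume must be positive")
    sodium_acetate_ml = aqueous_ml / 10  # 1/10 volume of 3 mol/l sodium acetate (pH 5.2)
    isopropanol_ml = aqueous_ml * 2      # 2 volumes of isopropanol
    return sodium_acetate_ml, isopropanol_ml


# Hypothetical example: 5 ml of aqueous layer recovered after the chloroform extraction.
acetate_ml, isopropanol_ml = precipitation_volumes(5.0)
print(f"add {acetate_ml} ml sodium acetate and {isopropanol_ml} ml isopropanol")
```
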
[0032] The genome DNA is dissolved again in a buffer containing 0.01 to 0.04 mg/ml RNase. As an example of the buffer, TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) can be used. After dissolving, the resultant solution is maintained at 25 to 40°C for 20 to 50 minutes and then extracted successively with phenol, phenol/chloroform and chloroform as in the above case.

[0033] After the extraction, isopropanol precipitation is carried out and the resulting DNA precipitate is washed with 70% ethanol, followed by air drying, and then dissolved in TE buffer to obtain a genome DNA solution.

(2) Production of shotgun library

[0034] Methods for producing a genome DNA library using the genome DNA of the coryneform bacteria prepared in the above (1) include a method described in Molecular Cloning, A Laboratory Manual, Second Edition (1989) (hereinafter referred to as "Molecular Cloning, 2nd ed."). In particular, the following method can be exemplified to prepare a genome DNA library appropriately usable in determining the full nucleotide sequence by the shotgun method.

[0035] To 0.01 mg of the genome DNA of the coryneform bacteria prepared in the above (1), a buffer, such as TE buffer or the like, is added to give a total volume of 0.4 ml. Then, the genome DNA is sheared into fragments of 1 to 10 kb with a sonicator (Yamato Powersonic Model 50). The treatment with the sonicator is performed at an output of 20 continuously for 5 seconds.

[0036] The resulting genome DNA fragments are blunt-ended using DNA blunting kit (manufactured by Takara Shuzo) or the like.

[0037] The blunt-ended genome fragments are fractionated by agarose gel or polyacrylamide gel electrophoresis and genome fragments of 1 to 2 kb are cut out from the gel.

[0038] To the gel, 0.2 to 0.5 ml of a buffer for eluting DNA, such as MG elution buffer (0.5 mol/l ammonium acetate, 10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) or the like, is added, followed by shaking at 25 to 40°C overnight to elute DNA.

[0039] The resulting DNA eluate is treated with phenol/chloroform and then precipitated with ethanol to obtain a genome library insert.

[0040] This insert is ligated into a suitable vector, such as pUC18 SmaI/SAP (manufactured by Amersham Pharmacia Biotech) or the like, using T4 ligase (manufactured by Takara Shuzo) or the like. The ligation can be carried out by allowing a mixture to stand at 10 to 20°C for 20 to 50 hours.

[0041] The resulting ligation product is precipitated with ethanol and dissolved in 5 to 20 µl of TE buffer.

[0042] Escherichia coli is transformed in accordance with a conventional method using 0.5 to 2 µl of the ligation solution. Examples of the transformation method include the electroporation method using ELECTRO MAX DH10B (manufactured by Life Technologies) for Escherichia coli. The electroporation method can be carried out under the conditions as described in the manufacturer's instructions.

[0043] The transformed Escherichia coli is spread on a suitable selection medium containing agar, for example, LB plate medium containing 10 to 100 mg/l ampicillin (LB medium (10 g/l bactotrypton, 5 g/l yeast extract, 10 g/l sodium chloride, pH 7.0) containing 1.6% of agar) when pUC18 is used as the cloning vector, and cultured therein.

[0044] The transformant can be obtained as colonies formed on the plate medium. In this step, it is possible to select the transformant having the recombinant DNA containing the genome DNA as white colonies by adding X-gal and IPTG (isopropyl-β-thiogalactopyranoside) to the plate medium.

[0045] The transformant is allowed to stand for culturing in a 96-well titer plate to which 0.05 ml of the LB medium containing 0.1 mg/ml of ampicillin has been added in each well. The resulting culture can be used in an experiment of (4) described below. Also, the culture solution can be stored at -80°C by adding 0.05 ml per well of the LB medium containing 20% glycerol to the culture solution, followed by mixing, and the stored culture solution can be used at any time.

(3) Production of cosmid library

[0046] The genome DNA (0.1 mg) of the coryneform bacteria prepared in the above (1) is partially digested with a restriction enzyme, such as Sau3AI or the like, and then ultracentrifuged (26,000 rpm, 18 hours, 20°C) under a 10 to 40% sucrose density gradient using a 10% sucrose buffer (1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA, 10% sucrose, pH 8.0) and a 40% sucrose buffer (elevating the concentration of the 10% sucrose buffer to 40%).

[0047] After the centrifugation, the thus separated solution is fractionated into tubes at 1 ml per tube. After confirming the DNA fragment size of each fraction by agarose gel electrophoresis, a fraction rich in DNA fragments of about 40 kb is precipitated with ethanol.

[0048] The resulting DNA fragment is ligated to a cosmid vector having a cohesive end which can be ligated to the fragment. When the genome DNA is partially digested with Sau3AI, the partially digested product can be ligated to, for example, the BamHI site of superCos1 (manufactured by Stratagene) in accordance with the manufacturer's instructions.

[0049] The resulting ligation product is packaged using a packaging extract which can be prepared by a method described in Molecular Cloning, 2nd ed. and then used in transforming Escherichia coli. More specifically, the ligation product is packaged using, for example, a commercially available packaging extract, Gigapack III Gold Packaging Extract (manufactured by Stratagene) in accordance with the manufacturer's instructions and then introduced into Escherichia coli XL-1-Blue MR (manufactured by Stratagene) or the like.

[0050] The thus transformed Escherichia coli is spread on an LB plate medium containing ampicillin, and cultured therein.

[0051] The transformant can be obtained as colonies formed on the plate medium.

[0052] The transformant is subjected to standing culture in a 96-well titer plate to which 0.05 ml of the LB medium containing 0.1 mg/ml ampicillin has been added.

[0053] The resulting culture can be employed in an experiment of (4) described below. Also, the culture solution can be stored at -80°C by adding 0.05 ml per well of the LB medium containing 20% glycerol to the culture solution, followed by mixing, and the stored culture solution can be used at any time.

(4) Determination of nucleotide sequence

(4-1) Preparation of template

[0054] The full nucleotide sequence of genome DNA of coryneform bacteria can be determined basically according to the whole genome shotgun method (Science, 269: 496-512 (1995)).

[0055] The template used in the whole genome shotgun method can be prepared by PCR using the library prepared in the above (2) (DNA Research, 5: 1-9 (1998)).

[0056] Specifically, the template can be prepared as follows.

[0057] The clone derived from the whole genome shotgun library is inoculated by using a replicator (manufactured by GENETIX) into each well of a 96-well plate to which 0.08 ml per well of the LB medium containing 0.1 mg/ml ampicillin has been added, followed by stationary culturing at 37°C overnight.

[0058] Next, the culture solution is transferred, using a copy plate (manufactured by Tokken), into each well of a 96-well reaction plate (manufactured by PE Biosystems) to which 0.025 ml per well of a PCR reaction solution has been added using TaKaRa Ex Taq (manufactured by Takara Shuzo). Then, PCR is carried out in accordance with the protocol by Makino et al. (DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700 (manufactured by PE Biosystems) to amplify the inserted fragments.

[0059] The excessive primers and nucleotides are eliminated using a kit for purifying a PCR product, and the product is used as the template in the sequencing reaction.

[0060] It is also possible to determine the nucleotide sequence using a double-stranded DNA plasmid as a template.

[0061] The double-stranded DNA plasmid used as the template can be obtained by the following method.

[0062] The clone derived from the whole genome shotgun library is inoculated into each well of a 24- or 96-well plate to which 1.5 ml per well of a 2 x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0) containing 0.05 mg/ml ampicillin has been added, followed by culturing under shaking at 37°C overnight.

[0063] The double-stranded DNA plasmid can be prepared from the culture solution using an automatic plasmid preparing machine KURABO PI-50 (manufactured by Kurabo Industries), a multiscreen (manufactured by Millipore) or the like, according to each protocol.

[0064] To purify the plasmid, Biomek 2000 manufactured by Beckman Coulter and the like can be used.

[0065] The resulting purified double-stranded DNA plasmid is dissolved in water to give a concentration of about 0.1 mg/ml. Then, it can be used as the template in sequencing.

(4-2) Sequencing reaction

[0066] The sequencing reaction can be carried out according to a commercially available sequence kit or the like. A specific method is exemplified below.

[0067] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured by PE Biosystems), 1 to 2 pmol of an M13 regular direction primer (M13-21) or an M13 reverse direction primer (M13REV) (DNA Research, 5: 1-9 (1998)) and 50 to 200 ng of the template prepared in the above (4-1) (the PCR product or plasmid) are added to give 10 µl of a sequencing reaction solution.

[0068] A dye terminator sequencing reaction (35 to 55 cycles) is carried out using this reaction solution and GeneAmp PCR System 9700 (manufactured by PE Biosystems) or the like. The cycle parameters can be determined in accordance with a commercially available kit, for example, the manufacturer's instructions attached to ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit.

[0069] The sample can be purified using a commercially available product, such as Multi Screen HV plate (manufactured by Millipore) or the like, according to the manufacturer's instructions.

[0070] The thus purified reaction product is precipitated with ethanol, dried and then used for the analysis. The dried reaction product can be stored in the dark at -30°C and the stored reaction product can be used at any time.

[0071] The dried reaction product can be analyzed using a commercially available sequencer and an analyzer according to the manufacturer's instructions.

[0072] Examples of the commercially available sequencer include ABI PRISM 377 DNA Sequencer (manufactured by PE Biosystems). Examples of the analyzer include ABI PRISM 3700 DNA Analyzer (manufactured by PE Biosystems).

(5) Assembly

[0073] Software, such as phred (The University of Washington) or the like, can be used as a base caller in analyzing the sequence information obtained in the above (4). Software, such as Cross_Match (The University of Washington) or SPS Cross_Match (manufactured by Southwest Parallel Software) or the like, can be used to mask the vector sequence information.
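
For orientation, base-calling quality and vector masking as mentioned in [0073] can be illustrated with the following Python sketch. It relies on the widely used phred convention Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The masking function is a deliberately simplified, hypothetical stand-in that masks only exact matches; it is not the algorithm used by the software named above, and all function names are assumptions.

```python
import math


def error_probability(quality):
    """Convert a phred-style quality value Q into an error probability P = 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def phred_quality(error_prob):
    """Convert an error probability P into a phred-style quality value Q = -10 log10(P)."""
    return -10 * math.log10(error_prob)


def mask_vector(read, vector_fragment, mask_char="X"):
    """Replace every exact occurrence of vector_fragment in read with mask characters."""
    if not vector_fragment:
        return read
    return read.replace(vector_fragment, mask_char * len(vector_fragment))


# Hypothetical read that begins with six nucleotides of vector sequence.
print(mask_vector("GGATCCACGTTAGC", "GGATCC"))
print(error_probability(20))  # a quality of 20 corresponds to about 1 error in 100 calls
```
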

[0074] For the assembly, software, such as phrap (The University of Washington), SPS phrap (manufactured by Southwest Parallel Software) or the like, can be used.

[0075] For the above analysis and the output of the results thereof, a computer such as a UNIX machine, a PC, a Macintosh, or the like can be used.

[0076] Contigs obtained by the assembly can be analyzed using a graphical editor such as consed (The University of Washington) or the like.

[0077] It is also possible to perform the series of operations from the base call to the assembly in a single run using the script phredPhrap attached to consed.

[0078] As used herein, software will be understood to also be referred to as a comparator.
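
The idea behind the assembly in (5), namely merging reads that share overlapping ends into contigs, can be sketched as follows. This is a toy greedy procedure using exact suffix/prefix overlaps, offered only as an illustration; the assembly software named above uses considerably more elaborate, quality-aware alignment. The function names, the minimum overlap, and the example reads are assumptions.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no remaining overlaps: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three hypothetical shotgun reads covering one short region.
print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```

Reads that share no overlap of at least `min_len` nucleotides remain as separate contigs, which corresponds to the gap parts treated in (6).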

(6) Determination of nucleotide sequence in gap part

[0079] Each of the cosmids in the cosmid library constructed in the above (3) is prepared in the same manner as in the preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end of the insert fragment of the cosmid is determined using a commercially available kit, such as ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's instructions.

[0080] About 800 cosmid clones are sequenced at both ends of the inserted fragment to detect a nucleotide sequence in the contig derived from the shotgun sequencing obtained in (5) which is coincident with the sequence. Thus, the chain linkage between respective cosmid clones and respective contigs is clarified, and mutual alignment is carried out. Furthermore, the results are compared with known physical maps to map the cosmids and the contigs. In the case of Corynebacterium glutamicum ATCC 13032, a physical map of Mol. Gen. Genet., 252: 255-265 (1996) can be used.
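
The linkage step of [0080], in which the two end sequences of a cosmid insert are located in contigs, can be sketched as below. The sketch assumes exact matching of an end sequence (or its reverse complement) within a contig, which is a simplification of a real similarity search; the function names and the example sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_contig(end_seq, contigs):
    """Return the name of the first contig containing end_seq on either strand, else None."""
    for name, seq in contigs.items():
        if end_seq in seq or reverse_complement(end_seq) in seq:
            return name
    return None


def link_contigs(cosmid_ends, contigs):
    """List (clone, contig, contig) triples for clones whose two ends lie in different contigs."""
    links = []
    for clone, (left_end, right_end) in cosmid_ends.items():
        a = find_contig(left_end, contigs)
        b = find_contig(right_end, contigs)
        if a is not None and b is not None and a != b:
            links.append((clone, a, b))
    return links


# Hypothetical data: one cosmid whose ends fall in two different contigs.
contigs = {"contig1": "ATGCCGTAAGCT", "contig2": "GGTACCTTGACA"}
cosmid_ends = {"cos1": ("CCGTAA", "TGTCAA")}
print(link_contigs(cosmid_ends, contigs))
```
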
[0081] The sequence in the region which cannot be covered with the contigs (gap part) can be determined by the following method.

[0082] Clones containing sequences positioned at the ends of the contigs are selected. Among these, a clone wherein only one end of the inserted fragment has been determined is selected and the sequence at the opposite end of the inserted fragment is determined.

[0083] A shotgun library clone or a cosmid clone derived therefrom containing the sequences at the respective ends of the inserted fragments in the two contigs is identified and the full nucleotide sequence of the inserted fragment of the clone is determined.

[0084] According to this method, the nucleotide sequence of the gap part can be determined.
[0085] When no shotgun library clone or cosmid clone covering the gap part is available, primers complementary to the end sequences of the two different contigs are prepared and the DNA fragment in the gap part is amplified. Then, sequencing is performed by the primer walking method using the amplified DNA fragment as a template or by the shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment is determined. Thus, the nucleotide sequence of the above-described region can be determined.
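
The primer preparation described in [0085] can be illustrated as follows. The sketch assumes that both contigs are written 5' to 3' on the same strand, with the gap lying between the end of the first contig and the start of the second; the primer length and the function names are illustrative assumptions, and practical primer design would also consider melting temperature and specificity.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gap_primers(left_contig, right_contig, length=20):
    """Return (forward, reverse) primers that point into the gap between two contigs."""
    forward = left_contig[-length:]                       # reads out of the left contig
    reverse = reverse_complement(right_contig[:length])   # reads back from the right contig
    return forward, reverse


# Hypothetical short contigs; a primer length of 6 is used only to keep the example readable.
print(gap_primers("ATGCCGTAAGCT", "GATTACAGGCCA", length=6))
```
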
[0086] In a region showing a low sequence accuracy, primers are synthesized using the AUTOFINISH function and NAVIGATING function of consed (The University of Washington), and the sequence is determined by the primer walking method to improve the sequence accuracy.

[0087] Examples of the thus determined nucleotide sequence of the full genome include the full nucleotide sequence of the genome of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1.

(7) Determination of nucleotide sequence of microorganism genome DNA using the nucleotide sequence represented by SEQ ID NO:1

[0088] A nucleotide sequence of a polynucleotide having a homology of 80% or more with the full nucleotide sequence of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1 as determined above can also be determined using the nucleotide sequence represented by SEQ ID NO:1, and the polynucleotide having a nucleotide sequence having a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present invention is within the scope of the present invention. The term "polynucleotide having a nucleotide sequence having a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present invention" is a polynucleotide in which a full nucleotide sequence of the chromosome DNA can be determined using as a primer an oligonucleotide composed of continuous 5 to 50 nucleotides in the nucleotide sequence represented by SEQ ID NO:1, for example, according to PCR using the chromosome DNA as a template. A particularly preferred primer in determination of the full nucleotide sequence is an oligonucleotide having nucleotide sequences which are positioned at the interval of about 300 to 500 bp, and among such oligonucleotides, an oligonucleotide having a nucleotide sequence selected from DNAs encoding a protein relating to a main metabolic pathway is particularly preferred. The polynucleotide in which the full nucleotide sequence of the chromosome DNA can be determined using the oligonucleotide includes polynucleotides constituting a chromosome DNA derived from a microorganism belonging to coryneform bacteria. Such a polynucleotide is preferably a polynucleotide constituting chromosome DNA derived from a microorganism belonging to the genus Corynebacterium, more preferably a polynucleotide constituting a chromosome DNA of Corynebacterium glutamicum.
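
The selection of primers positioned at regular intervals along a reference sequence, as described in [0088], can be sketched as follows. The function name, the default primer length of 20 nucleotides and the default interval of 400 bp are assumptions chosen from within the ranges stated above (5 to 50 nucleotides; about 300 to 500 bp), and the reference string is a placeholder rather than SEQ ID NO:1.

```python
def tiled_primers(reference, primer_len=20, interval=400):
    """Return (position, primer) pairs taken from the reference at a fixed interval."""
    if not 5 <= primer_len <= 50:
        raise ValueError("primer length should be 5 to 50 nucleotides")
    if interval <= 0:
        raise ValueError("interval must be positive")
    last_start = len(reference) - primer_len
    return [(pos, reference[pos:pos + primer_len])
            for pos in range(0, last_start + 1, interval)]


# Placeholder reference of 1,200 nucleotides standing in for a real genome sequence.
reference = "ACGT" * 300
for position, primer in tiled_primers(reference):
    print(position, primer)
```
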

2. Identification of ORF (open reading frame) and expression regulatory fragment and determination of the function of ORF

[0089] Based on the full nucleotide sequence data of the genome derived from coryneform bacteria determined in the above item 1, an ORF and an expression modulating fragment can be identified. Furthermore, the function of the thus determined ORF can be determined.

[0090] The ORF means a continuous region in the nucleotide sequence of mRNA which can be translated as an amino acid sequence to mature to a protein. A region of the DNA coding for the ORF of mRNA is also called ORF.
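
To make the notion of an ORF in [0090] concrete, the following Python sketch scans all six reading frames of a DNA sequence for stretches that begin with ATG and end at the first in-frame stop codon. It is a minimal illustration under stated assumptions: only ATG is treated as a start codon, the minimum length is arbitrary, and positions on the "-" strand are given relative to the reverse-complemented sequence. Model-based gene finders work quite differently.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna, min_codons=3):
    """Return (strand, start, end, sequence) for each ATG...stop stretch in six frames."""
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, seq[start:i + 3]))
                    start = None                   # look for the next ORF in this frame
    return orfs


# Hypothetical 16-nucleotide sequence containing one short ORF on the "+" strand.
print(find_orfs("CCATGAAATTTTAGCC"))
```
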
[0091] The expression modulating fragment (hereinafter referred to as "EMF") is used herein to define a series of polynucleotide fragments which modulate the expression of the ORF or another sequence ligated operatably thereto. The expression "modulate the expression of a sequence ligated operatably" is used herein to refer to changes in the expression of a sequence due to the presence of the EMF. Examples of the EMF include a promoter, an operator, an enhancer, a silencer, a ribosome-binding sequence, a transcriptional termination sequence, and the like. In coryneform bacteria, an EMF is usually present in an intergenic segment (a fragment positioned between two genes; about 10 to 200 nucleotides in length). Accordingly, an EMF is frequently present in an intergenic segment of 10 nucleotides or longer. It is also possible to determine or discover the presence of an EMF by using known EMF sequences as a target sequence or a target structural motif (or a target motif) using an appropriate software or comparator, such as FASTA (Proc. Natl. Acad. Sci. USA, 85: 2444-48 (1988)), BLAST (J. Mol. Biol., 215: 403-410 (1990)) or the like. Also, it can be identified and evaluated using a known EMF-capturing vector (for example, pKK232-8; manufactured by Amersham Pharmacia Biotech).

[0092] The term "target sequence" is used herein to refer to a nucleotide sequence composed of 6 or more nucleotides, an amino acid sequence composed of 2 or more amino acids, or a nucleotide sequence encoding this amino acid sequence composed of 2 or more amino acids. The longer a target sequence is, the lower the possibility that it appears at random in a data base. The target sequence is preferably about 10 to 100 amino acid residues or about 30 to 300 nucleotide residues.
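
The statement in [0092] that longer target sequences appear at random with lower possibility can be illustrated numerically. The sketch below assumes, purely for illustration, that every residue in the data base is independent and equally likely (4 nucleotides or 20 amino acids); real sequence data deviate from this assumption, and the function name and data base sizes are hypothetical.

```python
def expected_random_hits(target_len, database_len, alphabet_size=4):
    """Expected number of chance occurrences of one fixed target in a random data base."""
    positions = max(database_len - target_len + 1, 0)   # possible start positions
    return positions * alphabet_size ** -target_len


# A 6-nucleotide target versus a 30-nucleotide target in 3.3 million random nucleotides.
print(expected_random_hits(6, 3_300_000))
print(expected_random_hits(30, 3_300_000))
# A 10-residue amino acid target in one million random residues.
print(expected_random_hits(10, 1_000_000, alphabet_size=20))
```
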

[0093] The term "target structural motif" or "target motif" is used herein to refer to a sequence or a combination of sequences selected optionally and reasonably. Such a motif is selected on the basis of the three-dimensional structure formed by the folding of a polypeptide by means known to one of ordinary skill in the art. Various motifs are known.

[0094] Examples of the target motif of a polypeptide include, but are not limited to, an enzyme activity site, a protein-protein interaction site, a signal sequence, and the like. Examples of the target motif of a nucleic acid include a promoter sequence, a transcriptional regulatory factor binding sequence, a hairpin structure, and the like.

[0095] Examples of highly useful EMF include a high-expression promoter, an inducible-expression promoter, and the like. Such an EMF can be obtained by positionally determining the nucleotide sequence of a gene which is known or expected to achieve high expression (for example, ribosomal RNA gene: GenBank Accession No. M16175 or Z46753) or a gene showing a desired induction pattern (for example, isocitrate lyase gene induced by acetic acid: Japanese Published Unexamined Patent Application No. 56782/93) via the alignment with the full genome nucleotide sequence determined in the above item 1, and isolating the genome fragment in the upstream part (usually 200 to 500 nucleotides from the translation initiation site). It is also possible to obtain a highly useful EMF by selecting an EMF showing a high expression efficiency or a desired induction pattern from among promoters captured by the EMF-capturing vector as described above.
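
Once a gene has been positioned on the genome sequence, taking out the upstream part mentioned in [0095] is a matter of coordinates, as the following sketch shows. It assumes 0-based positions on the forward strand, where `start` is the position of the first nucleotide of the start codon; for a gene on the reverse strand the upstream region lies at higher forward coordinates and is returned reverse-complemented. The function name and the short example genomes are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def upstream_region(genome, start, length=500, strand="+"):
    """Return up to `length` nucleotides upstream of the translation initiation site."""
    if strand == "+":
        return genome[max(0, start - length):start]
    if strand == "-":
        return reverse_complement(genome[start + 1:start + 1 + length])
    raise ValueError("strand must be '+' or '-'")


# Forward-strand gene: ATG begins at position 12, so the 4 upstream nucleotides are CCCC.
print(upstream_region("TTTTGGGGCCCCATGAAA", 12, length=4))
# Reverse-strand gene: the A of ATG pairs with forward position 5 (forward text reads CAT).
print(upstream_region("TTTCATGGGGAAAA", 5, length=4, strand="-"))
```
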

[0096] The ORF can be identified by extracting characteristics common to individual ORFs, constructing a general model based on these characteristics, and measuring the conformity of the subject sequence with the model. In the identification, software, such as GeneMark (Nuc. Acids. Res., 22: 4756-67 (1994); manufactured by GenePro), GeneMark.hmm (manufactured by GenePro), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)), Glimmer (Nuc. Acids. Res., 26: 544-548 (1998); manufactured by The Institute for Genomic Research), or the like, can be used. In using the software, the default (initial setting) parameters are usually used, though the parameters can be optionally changed.

[0097] In the above-described comparisons, a computer, such as a UNIX machine, a PC, a Macintosh, or the like, can be used.

[0098] Examples of the ORF determined by the method of the present invention include ORFs having the nucleotide sequences represented by SEQ ID NOS:2 to 3501 present in the genome of Corynebacterium glutamicum as represented by SEQ ID NO:1. In these ORFs, polypeptides having the amino acid sequences represented by SEQ ID NOS:3502 to 7001 are encoded.

[0099] The function of an ORF can be determined by comparing the identified amino acid sequence of the ORF with known homologous sequences using a homology searching software or comparator, such as BLAST, FASTA, Smith & Waterman (Meth. Enzym., 164: 765 (1988)) or the like on an amino acid data base, such as Swiss-Prot, PIR, GenBank-nr-aa, GenPept constituted by protein-encoding domains derived from GenBank data base, OWL or the like.

[0100] Furthermore, by the homology searching, the identity and similarity with the amino acid sequences of known proteins can also be analyzed.

[0101] With respect to the term "identity" used herein, where two polypeptides each having 10 amino acids differ at the positions of 3 amino acids, these polypeptides have an identity of 70% with each other. In the case where one of the 3 different amino acids is an analogue (for example, leucine and isoleucine), these polypeptides have a similarity of 80%.
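
The arithmetic of [0101] can be reproduced with the short sketch below, which compares two aligned peptides of equal length position by position. The grouping of analogous amino acids is an illustrative assumption (only the leucine/isoleucine pair is taken from the text), and the two peptides are invented so that they differ at 3 of 10 positions, one of which is a leucine/isoleucine difference.

```python
# Illustrative groups of analogous amino acids (an assumption, not a standard matrix).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(a, b):
    """Return (identity %, similarity %) for two aligned peptides of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("peptides must be non-empty and of equal length")
    identical = sum(x == y for x, y in zip(a, b))
    analogous = sum(x != y and any(x in g and y in g for g in SIMILAR_GROUPS)
                    for x, y in zip(a, b))
    return 100 * identical / len(a), 100 * (identical + analogous) / len(a)


# Differences: L/I (analogous), D/R (not analogous), S/C (not analogous).
print(identity_and_similarity("MKLAGDTWSP", "MKIAGRTWCP"))  # (70.0, 80.0)
```
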

[0102] As a specific example, Table 1 shows the registration numbers in known data bases of sequences which are judged as having the highest similarity with the nucleotide sequence of the ORF derived from Corynebacterium glutamicum ATCC 13032, genes of these sequences, functions of these genes, and identities thereof compared with known amino acid translation sequences.

[0103] Thus, a great number of novel genes derived from coryneform bacteria can be identified by determining the full nucleotide sequence of the genome derived from coryneform bacterium by the means of the present invention. Moreover, the function of the proteins encoded by these genes can be determined. Since coryneform bacteria are industrially highly useful microorganisms, many of the identified genes are industrially useful.

[0104] Moreover, the characteristics of respective microorganisms can be clarified by classifying the functions thus determined. As a result, valuable information in breeding is obtained.

[0105] Furthermore, from the ORF information derived from coryneform bacteria, the ORF corresponding to the microorganism is prepared and obtained according to the general method as disclosed in Molecular Cloning, 2nd ed. or the like. Specifically, an oligonucleotide having a nucleotide sequence adjacent to the ORF is synthesized, and the ORF can be isolated and obtained using the oligonucleotide as a primer and a chromosome DNA derived from coryneform bacteria as a template according to the general PCR cloning technique. Thus obtained ORF sequences include polynucleotides comprising the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3501.

[0106] The ORF or primer can be prepared using a polypeptide synthesizer based on the above sequence information.

[0107] Examples of the polynucleotide of the present invention include a polynucleotide containing the nucleotide
sequence of the ORF obtained in the above, and a polynucleotide which hybridizes with the polynucleotide under
stringent conditions.

[0108] The polynucleotide of the present invention can be a single-stranded DNA, a double-stranded DNA and a
single-stranded RNA, though it is not limited thereto.

[0109] The polynucleotide which hybridizes with the polynucleotide containing the nucleotide sequence of the ORF
obtained in the above under stringent conditions includes a degenerated mutant of the ORF. A degenerated mutant is
a polynucleotide fragment having a nucleotide sequence which is different from the sequence of the ORF of the present
invention but which encodes the same amino acid sequence owing to the degeneracy of the genetic code.
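
The notion of a degenerated mutant can be illustrated by translating two nucleotide sequences and comparing the results. The following is a minimal sketch, assuming the standard genetic code; the function names and the two toy sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate a DNA ORF (length a multiple of 3) into one-letter amino acids."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a, orf_b):
    """True if the two ORFs differ in nucleotides but encode the same amino acids."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Toy sequences (hypothetical).
original = "ATGGTTGCTAAATAA"  # Met-Val-Ala-Lys-stop
variant = "ATGGTCGCAAAGTAA"   # same amino acids, synonymous third-position changes
print(translate(original), is_degenerate_variant(original, variant))
```

Here the variant differs from the original at three nucleotide positions, yet both translate to the same amino acid sequence.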

[0110] Specific examples include a polynucleotide comprising the nucleotide sequence represented by any one of
SEQ ID NOS:2 to 3431, and a polynucleotide which hybridizes with the polynucleotide under stringent conditions.

[0111] A polynucleotide which hybridizes under stringent conditions is a polynucleotide obtained by colony hybridization,
plaque hybridization, Southern blot hybridization or the like using, as a probe, the polynucleotide having the
nucleotide sequence of the ORF identified in the above. Specific examples include a polynucleotide which can be
identified by carrying out hybridization at 65°C in the presence of 0.7-1.0 M NaCl using a filter on which a polynucleotide
prepared from colonies or plaques is immobilized, and then washing the filter with 0.1x to 2x SSC solution (the
composition of 1x SSC contains 150 mM sodium chloride and 15 mM sodium citrate) at 65°C.

[0112] The hybridization can be carried out in accordance with known methods described in, for example, Molecular
Cloning, 2nd ed., Current Protocols in Molecular Biology, DNA Cloning 1: Core Techniques, A Practical Approach,
Second Edition, Oxford University (1995) or the like. Specific examples of the polynucleotide which can be hybridized
include a DNA having a homology of 60% or more, preferably 80% or more, and particularly preferably 95% or more,
with the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3431 when calculated using default (initial
setting) parameters of a homology searching software, such as BLAST, FASTA, Smith-Waterman or the like.

[0113] Also, the polynucleotide of the present invention includes a polynucleotide encoding a polypeptide comprising
the amino acid sequence represented by any one of SEQ ID NOS:3502 to 6931 and a polynucleotide which hybridizes
with the polynucleotide under stringent conditions.

[0114] Furthermore, the polynucleotide of the present invention includes a polynucleotide which is present in the 5'
upstream or 3' downstream region of a polynucleotide comprising the nucleotide sequence of any one of SEQ ID NOS:
2 to 3431 in a polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1, and has an activity
of regulating an expression of a polypeptide encoded by the polynucleotide. Specific examples of the polynucleotide
having an activity of regulating an expression of a polypeptide encoded by the polynucleotide include a polynucleotide
encoding the above described EMF, such as a promoter, an operator, an enhancer, a silencer, a ribosome-binding
sequence, a transcriptional termination sequence, and the like.

[0115] The primer used for obtaining the ORF according to the above PCR cloning technique includes an oligonucleotide
comprising a sequence which is the same as a sequence of 10 to 200 continuous nucleotides in the nucleotide
sequence of the ORF and an adjacent region or an oligonucleotide comprising a sequence which is complementary
to the oligonucleotide. Specific examples include an oligonucleotide comprising a sequence which is the same as a
sequence of 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1
to 3431, and an oligonucleotide comprising a sequence complementary to the oligonucleotide comprising a sequence
of at least 10 to 20 continuous nucleotides of any one of SEQ ID NOS:1 to 3431. When the primers are used as a sense
primer and an antisense primer, the above-described oligonucleotides in which melting temperature (Tm) and the
number of nucleotides are not significantly different from each other are preferred.
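
Whether a sense primer and an antisense primer are comparable in Tm and length can be checked by a simple calculation. The following is a minimal sketch, assuming the Wallace rule (2°C per A or T and 4°C per G or C) as a rough Tm estimate for short oligonucleotides; this rule, the tolerance values, the function names and the primer sequences are assumptions introduced for illustration and are not specified in the text.

```python
def wallace_tm(primer):
    """Estimate Tm (degrees C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    if set(p) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G and T")
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primers_are_matched(sense, antisense, max_tm_diff=4, max_len_diff=3):
    """True if the two primers differ little in estimated Tm and in length.

    The tolerances are hypothetical defaults, not values from the source.
    """
    tm_ok = abs(wallace_tm(sense) - wallace_tm(antisense)) <= max_tm_diff
    len_ok = abs(len(sense) - len(antisense)) <= max_len_diff
    return tm_ok and len_ok


# Hypothetical 20-nucleotide primers.
sense_primer = "ATGGCTAAGCTTGACCGTAC"
antisense_primer = "TTACGGCTAGCATCGGTCAA"
print(wallace_tm(sense_primer), wallace_tm(antisense_primer))
print(primers_are_matched(sense_primer, antisense_primer))
```

More elaborate Tm models (for example, nearest-neighbor methods) could be substituted for wallace_tm without changing the comparison itself.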

[0116] The oligonucleotide of the present invention includes an oligonucleotide comprising a sequence which is the
same as 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1 to
3431 or an oligonucleotide comprising a sequence complementary to the oligonucleotide.

[0117] Analogues of these oligonucleotides (hereinafter also referred to as "analogous oligonucleotides") are
also provided by the present invention and are useful in the methods described herein.

[0118] Examples of the analogous oligonucleotides include analogous oligonucleotides in which a phosphodiester
bond in an oligonucleotide is converted to a phosphorothioate bond, analogous oligonucleotides in which a phosphodiester
bond in an oligonucleotide is converted to an N3'-P5' phosphoamidate bond, analogous oligonucleotides in which
ribose and a phosphodiester bond in an oligonucleotide are converted to a peptide nucleic acid bond, analogous
oligonucleotides in which uracil in an oligonucleotide is replaced with C-5 propynyluracil, analogous oligonucleotides in
which uracil in an oligonucleotide is replaced with C-5 thiazoluracil, analogous oligonucleotides in which cytosine in
an oligonucleotide is replaced with C-5 propynylcytosine, analogous oligonucleotides in which cytosine in an
oligonucleotide is replaced with phenoxazine-modified cytosine, analogous oligonucleotides in which ribose in an
oligonucleotide is replaced with 2'-O-propylribose, analogous oligonucleotides in which ribose in an oligonucleotide is replaced
with 2'-methoxyethoxyribose, and the like (Cell Engineering, tft 1463 (1997)).

[0119] The above oligonucleotides and analogous oligonucleotides of the present invention can be used as probes
for hybridization and as antisense nucleic acids described below, in addition to as primers.

[0120] Examples of a primer for the antisense nucleic acid techniques known in the art include an oligonucleotide
which hybridizes with the oligonucleotide of the present invention under stringent conditions and has an activity regulating
expression of the polypeptide encoded by the polynucleotide, in addition to the above oligonucleotide.

3. Determination of isozymes

[0121] Many mutants of coryneform bacteria which are useful in the production of useful substances, such as amino
acids, nucleic acids, vitamins, saccharides, organic acids, and the like, are obtained by the present invention.

[0122] However, since the gene sequence data of the microorganism has been, to date, insufficient, useful mutants
have been obtained by mutagenic techniques using a mutagen, such as nitrosoguanidine (NTG) or the like.

[0123] Although genes can be mutated randomly by the mutagenic method using the above-described mutagen, all
genes encoding respective isozymes having similar properties relating to the metabolism of intermediates cannot be
mutated. In the mutagenic method using a mutagen, genes are mutated randomly. Accordingly, harmful mutations
worsening culture characteristics, such as delay in growth, accelerated foaming, and the like, might be imparted at a
great frequency, in a random manner.

[0124] However, if gene sequence information is available, such as is provided by the present invention, it is possible
to mutate all of the genes encoding target isozymes. In this case, harmful mutations may be avoided and the target
mutation can be incorporated.

[0125] Namely, an accurate number and sequence information of the target isozymes in coryneform bacteria can be
obtained based on the ORF data obtained in the above item 2. By using the sequence information, all of the target
isozyme genes can be mutated into genes having the desired properties by, for example, the site-specific mutagenesis
method described in Molecular Cloning, 2nd ed. to obtain useful mutants having elevated productivity of useful
substances.

4. Clarification or determination of biosynthesis pathway and signal transmission pathway

[0126] Attempts have been made to elucidate biosynthesis pathways and signal transmission pathways in a number
of organisms, and many findings have been reported. However, there are many unknown aspects of coryneform
bacteria since a number of genes have not been identified so far.

[0127] These unknown points can be clarified by the following method.

[0128] The functional information of ORF derived from coryneform bacteria as identified by the method of above item
2 is arranged. The term "arranged" means that the ORF is classified based on the biosynthesis pathway of a substance
or the signal transmission pathway to which the ORF belongs using known information according to the functional
information. Next, the arranged ORF sequence information is compared with enzymes on the biosynthesis pathways
or signal transmission pathways of other known organisms. The resulting information is combined with known data on
coryneform bacteria. Thus, the biosynthesis pathways and signal transmission pathways in coryneform bacteria, which
have been unknown so far, can be determined.

[0129] Once these pathways, which have hitherto been unknown or unclear, are clarified, a useful mutant
for producing a target useful substance can be efficiently obtained.

[0130] When the thus clarified pathway is judged as important in the synthesis of a useful product, a useful mutant
can be obtained by selecting a mutant wherein this pathway has been strengthened. Also, when the thus clarified
pathway is judged as not important in the biosynthesis of the target useful product, a useful mutant can be obtained
by selecting a mutant wherein the utilization frequency of this pathway is lowered.

5. Clarification or determination of useful mutation point

[0131] Many useful mutants of coryneform bacteria which are suitable for the production of useful substances, such
as amino acids, nucleic acids, vitamins, saccharides, organic acids, and the like, have been obtained. However, it is
hardly known which mutation point is imparted to a gene to improve the productivity.

[0132] However, mutation points contained in production strains can be identified by comparing desired sequences
of the genome DNA of the production strains obtained from coryneform bacteria by the mutagenic technique with the
nucleotide sequences of the corresponding genome DNA and ORF derived from coryneform bacteria determined by
the methods of the above items 1 and 2 and analyzing them.

[0133] Moreover, effective mutation points contributing to the production can be easily specified from among these
mutation points on the basis of known information relating to the metabolic pathways, the metabolic regulatory
mechanisms, the structure activity correlation of enzymes, and the like.

[0134] When any efficient mutation can be hardly specified based on known data, the mutation points thus identified
can be introduced into a wild strain of coryneform bacteria or a production strain free of the mutation. Then, it is examined
whether or not any positive effect can be achieved on the production.

[0135] For example, by comparing the nucleotide sequence of homoserine dehydrogenase gene hom of a lysine-producing
B-6 strain of Corynebacterium glutamicum (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)) with the
nucleotide sequence corresponding to the genome of Corynebacterium glutamicum ATCC 13032 according to the present
invention, a mutation of amino acid replacement in which valine at the 59-position is replaced with alanine (Val59Ala)
was identified. A strain obtained by introducing this mutation into the ATCC 13032 strain by the gene replacement
method can produce lysine, which indicates that this mutation is an effective mutation contributing to the production
of lysine.
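
The comparison of a production-strain gene with the corresponding wild-type ORF can be sketched as a codon-by-codon translation that reports each amino acid replacement in the notation used above (for example, Val59Ala). The following is a minimal sketch, assuming the standard genetic code and two ORFs of equal length without insertions or deletions; the function name and the toy sequences are hypothetical and do not reproduce the hom gene.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def find_replacements(wild_orf, mutant_orf):
    """List amino acid replacements between two equal-length ORFs."""
    wild_orf, mutant_orf = wild_orf.upper(), mutant_orf.upper()
    if len(wild_orf) != len(mutant_orf) or len(wild_orf) % 3 != 0:
        raise ValueError("ORFs must have equal length, a multiple of 3")
    changes = []
    for i in range(0, len(wild_orf), 3):
        wild_aa = CODON_TABLE[wild_orf[i:i + 3]]
        mutant_aa = CODON_TABLE[mutant_orf[i:i + 3]]
        if wild_aa != mutant_aa:  # synonymous changes are not reported
            position = i // 3 + 1
            changes.append(f"{THREE_LETTER[wild_aa]}{position}{THREE_LETTER[mutant_aa]}")
    return changes


# Toy ORFs (hypothetical).
wild = "ATGGTTAAAGGCTAA"    # Met-Val-Lys-Gly-stop
mutant = "ATGGCTAAAGGTTAA"  # codon 2 GTT to GCT; codon 4 changed synonymously
print(find_replacements(wild, mutant))
```

In this toy case only the replacement at the 2-position is reported, because the change in the fourth codon does not alter the encoded amino acid.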

[0136] Similarly, by comparing the nucleotide sequence of pyruvate carboxylase gene pyc of the B-6 strain with the
nucleotide sequence corresponding to the ATCC 13032 genome, a mutation of amino acid replacement in which proline
at the 458-position was replaced with serine (Pro458Ser) was identified. A strain obtained by introducing this mutation
into a lysine-producing strain of No. 58 (FERM BP-7134) of Corynebacterium glutamicum free of this mutation shows
an improved lysine productivity in comparison with the No. 58 strain, which indicates that this mutation is an effective
mutation contributing to the production of lysine.

[0137] In addition, a mutation Ala213Thr in glucose-6-phosphate dehydrogenase was specified as an effective
mutation relating to the production of lysine by detecting glucose-6-phosphate dehydrogenase gene zwf of the B-6 strain.

[0138] Furthermore, the lysine productivity of Corynebacterium glutamicum was improved by replacing the base at
the 932-position of aspartokinase gene lysC of the Corynebacterium glutamicum ATCC 13032 genome with cytosine
to thereby replace threonine at the 311-position by isoleucine, which indicates that this mutation is an effective mutation
contributing to the production of lysine.

[0139] Also, as another method to examine whether or not the identified mutation point is an effective mutation, there
is a method in which the mutation possessed by the lysine-producing strain is returned to the sequence of a wild type
strain by the gene replacement method and it is examined whether or not this has a negative influence on the lysine
productivity. For example, when the amino acid replacement mutation Val59Ala possessed by hom of the lysine-producing B-6 strain
was returned to a wild type amino acid sequence, the lysine productivity was lowered in comparison with the B-6 strain.
Thus, it was found that this mutation is an effective mutation contributing to the production of lysine.

[0140] Effective mutation points can be more efficiently and comprehensively extracted by combining, if needed, the
DNA array analysis or proteome analysis described below.

6. Method of breeding industrially advantageous production strain

[0141] It has been a general practice to construct production strains, which are used industrially in the fermentation
production of the target useful substances, such as amino acids, nucleic acids, vitamins, saccharides, organic acids,
and the like, by repeating mutagenesis and breeding based on random mutagenesis using mutagens, such as NTG
or the like, and screening.

[0142] In recent years, many examples of improved production strains have been made through the use of recombinant
DNA techniques. In breeding, however, most of the parent production strains to be improved are mutants
obtained by a conventional mutagenic procedure (W. Leuchtenberger, Amino Acids - Technical Production and Use. In:
Roehr (ed) Biotechnology, second edition, vol. 6, products of primary metabolism. VCH Verlagsgesellschaft mbH,
Weinheim, p. 465 (1996)).

[0143] Although mutagenesis methods have largely contributed to the progress of the fermentation industry, they
suffer from a serious problem of multiple, random introduction of mutations into every part of the chromosome. Since
many mutations are accumulated in a single chromosome each time a strain is improved, a production strain obtained
by the random mutation and selecting is generally inferior in properties (for example, showing poor growth, delayed
consumption of saccharides, and poor resistance to stresses such as temperature and oxygen) to a wild type strain,
which brings about troubles such as failing to establish a sufficiently elevated productivity, being frequently contaminated
with miscellaneous bacteria, requiring troublesome procedures in culture maintenance, and the like, and, in its
turn, elevating the production cost in practice. In addition, the improvement in the productivity is based on random
mutations and thus the mechanism thereof is unclear. Therefore, it is very difficult to plan a rational breeding strategy
for the subsequent improvement in the productivity.

[0144] According to the present invention, effective mutation points contributing to the production can be efficiently
specified from among many mutation points accumulated in the chromosome of a production strain which has been
bred from coryneform bacteria and, therefore, a novel breeding method of assembling these effective mutations in the
coryneform bacteria can be established. Thus, a useful production strain can be reconstructed. It is also possible to
construct a useful production strain from a wild type strain.

[0145] Specifically, a useful mutant can be constructed in the following manner.

[0146] One of the mutation points is incorporated into a wild type strain of coryneform bacteria. Then, it is examined
whether or not a positive effect is established on the production. When a positive effect is obtained, the mutation point
is saved. When no effect is obtained, the mutation point is removed. Subsequently, only a strain having the effective
mutation point is used as the parent strain, and the same procedure is repeated. In general, the effectiveness of a
mutation positioned upstream cannot be clearly evaluated in some cases when there is a rate-determining point in the
downstream of a biosynthesis pathway. It is therefore preferred to successively evaluate mutation points upward from
downstream.
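
The stepwise procedure described above can be sketched as a simple loop in which each candidate mutation is kept only when it improves the measured production. The following is a minimal sketch under stated assumptions: the mutation names, the function names and the toy production model (in which an upstream mutation shows an effect only after a downstream rate-determining point has been relieved) are hypothetical and stand in for actual culture experiments.

```python
from typing import Callable, FrozenSet, Iterable


def assemble_effective_mutations(
    candidates: Iterable[str],
    measure: Callable[[FrozenSet[str]], float],
) -> FrozenSet[str]:
    """Keep each candidate mutation only if it raises the measured production."""
    strain: FrozenSet[str] = frozenset()  # wild type strain: no mutations
    best = measure(strain)
    for mutation in candidates:           # expected order: downstream first
        trial = strain | {mutation}
        score = measure(trial)
        if score > best:                  # positive effect: the mutation is saved
            strain, best = trial, score
        # no positive effect: the mutation is removed and the parent is kept
    return strain


def toy_production(strain: FrozenSet[str]) -> float:
    """Hypothetical production model with a downstream rate-determining point."""
    titre = 10.0
    if "downstream_fix" in strain:
        titre += 5.0
        if "upstream_boost" in strain:    # visible only once the bottleneck is relieved
            titre += 3.0
    if "harmful" in strain:
        titre -= 2.0
    return titre


downstream_first = assemble_effective_mutations(
    ["downstream_fix", "harmful", "upstream_boost"], toy_production
)
upstream_first = assemble_effective_mutations(
    ["upstream_boost", "harmful", "downstream_fix"], toy_production
)
print(sorted(downstream_first), sorted(upstream_first))
```

In this toy model, evaluating the downstream mutation first retains both beneficial mutations, whereas evaluating the upstream mutation first discards it because its effect is masked, which illustrates why evaluation upward from downstream is preferred.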

[0147] By reconstituting effective mutations by the method as described above in a wild type strain or a strain which
has a high growth speed or the same ability to consume saccharides as the wild type strain, it is possible to construct
an industrially advantageous strain which is free of troubles in the previous methods as described above and to conduct
fermentation production using such strains within a short time or at a higher temperature.

[0148] For example, a lysine-producing mutant B-6 (Appl. Microbiol. Biotechnol., 32: 262-273 (1989)), which is
obtained by multiple rounds of random mutagenesis from a wild type strain Corynebacterium glutamicum ATCC 13032,
enables lysine fermentation to be performed at a temperature between 30 and 34°C but shows lowered growth and
lysine productivity at a temperature exceeding 34°C. Therefore, the fermentation temperature should be maintained
at 34°C or lower. In contrast thereto, the production strain described in the above item 5, which is obtained by
reconstituting effective mutations relating to lysine production, can achieve a productivity at 40 to 42°C equal or superior to
the result obtained by culturing at 30 to 34°C. Therefore, this strain is industrially advantageous since it can save the
load of cooling during the fermentation.

[0149] When culture should be carried out at a high temperature exceeding 43°C, a production strain capable of
conducting fermentation production at a high temperature exceeding 43°C can be obtained by reconstituting useful
mutations in a microorganism belonging to the genus Corynebacterium which can grow at high temperature exceeding
43°C. Examples of the microorganism capable of growing at a high temperature exceeding 43°C include Corynebacterium
thermoaminogenes, such as Corynebacterium thermoaminogenes FERM 9244, FERM 9245, FERM 9246 and
FERM 9247.

[0150] A strain having a further improved productivity of the target product can be obtained using the thus
reconstructed strain as the parent strain and further breeding it using the conventional mutagenesis method, the gene
amplification method, the gene replacement method using the recombinant DNA technique, the transduction method or
the cell fusion method. Accordingly, the microorganism of the present invention includes, but is not limited to, a mutant,
a cell fusion strain, a transformant, a transductant or a recombinant strain constructed by using recombinant DNA
techniques, so long as it is a producing strain obtained via the step of accumulating at least two effective mutations in
a coryneform bacterium in the course of breeding.

[0151] When a mutation point judged as being harmful to the growth or production is specified, on the other hand,
it is examined whether or not the producing strain used at present contains the mutation point. When it has the mutation,
it can be returned to the wild type gene and thus a further useful production strain can be bred.

[0152] The breeding method as described above is applicable to microorganisms, other than coryneform bacteria,
which have industrially advantageous properties (for example, microorganisms capable of quickly utilizing less expensive
carbon sources, microorganisms capable of growing at higher temperatures).

7. Production and utilization of polynucleotide array

(1) Production of polynucleotide array

[0153] A polynucleotide array can be produced using the polynucleotide or oligonucleotide of the present invention
obtained in the above items 1 and 2.

[0154] Examples include a polynucleotide array comprising a solid support to which at least one of a polynucleotide
comprising the nucleotide sequence represented by SEQ ID NOS:2 to 3501, a polynucleotide which hybridizes with
the polynucleotide under stringent conditions, and a polynucleotide comprising 10 to 200 continuous nucleotides in
the nucleotide sequence of the polynucleotide is adhered; and a polynucleotide array comprising a solid support to
which at least one of a polynucleotide encoding a polypeptide comprising the amino acid sequence represented by
any one of SEQ ID NOS:3502 to 7001, a polynucleotide which hybridizes with the polynucleotide under stringent
conditions, and a polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequences of the
polynucleotides is adhered.

[0155] Polynucleotide arrays of the present invention include substrates known in the art, such as a DNA chip, a
DNA microarray and a DNA macroarray, and the like, and comprise a solid support and plural polynucleotides or
fragments thereof which are adhered to the surface of the solid support.

[0156] Examples of the solid support include a glass plate, a nylon membrane, and the like.

[0157] The polynucleotides or fragments thereof adhered to the surface of the solid support can be adhered to the
surface of the solid support using the general technique for preparing arrays, namely, a method in which they are
adhered to a chemically surface-treated solid support, for example, one to which a polycation such as polylysine or the like
has been adhered (Nat. Genet., 21: 15-19 (1999)). The chemically surface-treated supports are commercially available
and the commercially available solid product can be used as the solid support of the polynucleotide array according
to the present invention.

[0158] As the polynucleotides or oligonucleotides adhered to the solid support, the polynucleotides and
oligonucleotides of the present invention obtained in the above items 1 and 2 can be used.

[0159] The analysis described below can be efficiently performed by adhering the polynucleotides or oligonucleotides
to the solid support at a high density, though a high fixation density is not always necessary.

[0160] Apparatus for achieving a high fixation density, such as an arrayer robot or the like, is commercially available
from Takara Shuzo (GMS417 Arrayer), and the commercially available product can be used.

[0161] Also, the oligonucleotides of the present invention can be synthesized directly on the solid support by the
photolithography method or the like (Nat. Genet., 21: 20-24 (1999)). In this method, a linker having a protective group
which can be removed by light irradiation is first adhered to a solid support, such as a slide glass or the like. Then, it
is irradiated with light through a mask (a photolithograph mask) permeating light exclusively at a definite part of the
adhesion part. Next, an oligonucleotide having a protective group which can be removed by light irradiation is added
to the part. Thus, a ligation reaction with the nucleotide arises exclusively at the irradiated part. By repeating this
procedure, oligonucleotides, each having a desired sequence, different from each other can be synthesized in
respective parts. Usually, the oligonucleotides to be synthesized have a length of 10 to 30 nucleotides.

(2) Use of polynucleotide array

[0162] The following procedures (a) and (b) can be carried out using the polynucleotide array prepared in the above
(1).

(a) Identification of mutation point of coryneform bacterium mutant and analysis of expression amount and expression
profile of gene encoded by genome

[0163] By subjecting a gene derived from a mutant of coryneform bacteria or an examined gene to the following
steps (i) to (iv), the mutation point of the gene can be identified or the expression amount and expression profile of the
gene can be analyzed:

(i) producing a polynucleotide array by the method of the above (1); 

(ii) incubating polynucleotides immobilized on the polynucleotide array together with the labeled gene derived from
a mutant of the coryneform bacterium using the polynucleotide array produced in the above (i) under hybridization
conditions;

(iii) detecting the hybridization; and 

(iv) analyzing the hybridization data. 

[0164] The gene derived from a mutant of coryneform bacteria or the examined gene includes a gene relating to
biosynthesis of at least one selected from amino acids, nucleic acids, vitamins, saccharides, organic acids, and
analogues thereof.

[0165] The method will be described in detail.

[0166] A single nucleotide polymorphism (SNP) in a human region of 2,300 kb has been identified using polynucleotide
arrays (Science, 280: 1077-82 (1998)). In accordance with the method of identifying SNP and methods described
in Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999); Science, 284: 1520-23 (1999), and
the like using the polynucleotide array produced in the above (1) and a nucleic acid molecule (DNA, RNA) derived from
coryneform bacteria in the method of the hybridization, a mutation point of a useful mutant, which is useful in producing
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, or the like can be identified and the gene
expression amount and the expression profile thereof can be analyzed.

[0167] The nucleic acid molecule (DNA, RNA) derived from the coryneform bacteria can be obtained according to
the general method described in Molecular Cloning, 2nd ed. or the like. mRNA derived from Corynebacterium glutamicum
can also be obtained by the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)) or the like.

[0168] Although ribosomal RNA (rRNA) is usually obtained in large excess in addition to the target mRNA, the
analysis is not seriously disturbed thereby.

[0169] The resulting nucleic acid molecule derived from coryneform bacteria is labeled. Labeling can be carried out
according to a method using a fluorescent dye, a method using a radioisotope or the like.

[0170] Specific examples include a labeling method in which psoralen-biotin is crosslinked with RNA extracted from
a microorganism and, after hybridization reaction, a fluorescent dye having streptavidin bound thereto is bound to
the biotin moiety (Nat. Biotechnol., 16: 45-48 (1998)); a labeling method in which a reverse transcription reaction is
carried out using RNA extracted from a microorganism as a template and random primers as primers, and dUTP having
a fluorescent dye (for example, Cy3, Cy5) (manufactured by Amersham Pharmacia Biotech) is incorporated into cDNA
(Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999)); and the like.

[0171] The labeling specificity can be improved by replacing the random primers by sequences complementary to
the 3'-end of ORF (J. Bacteriol., 181: 6425-40 (1999)).

[0172] In the hybridization method, the hybridization and subsequent washing can be carried out by the general
method (Nat. Biotechnol., 14: 1675-80 (1996), or the like).

[0173] Subsequently, the hybridization intensity is measured depending on the hybridization amount of the nucleic
acid molecule used in the labeling. Thus, the mutation point can be identified and the expression amount of the gene
can be calculated.
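
One way in which measured hybridization intensities may be converted into relative expression amounts is sketched below. This is a minimal sketch, assuming background-subtracted spot intensities and a log2 ratio between a mutant sample and a wild-type reference; the ratio calculation, the function name, the spot names and all intensity values are hypothetical and are not prescribed by the method described above.

```python
import math


def log2_ratios(sample, reference, background=0.0, floor=1.0):
    """Per-spot log2(sample/reference) after background subtraction.

    Intensities at or below the background are clamped to `floor` so that the
    logarithm stays defined (an illustrative choice).
    """
    ratios = {}
    for spot, sample_intensity in sample.items():
        s = max(sample_intensity - background, floor)
        r = max(reference[spot] - background, floor)
        ratios[spot] = math.log2(s / r)
    return ratios


# Hypothetical spot intensities in arbitrary units; the ORF names are placeholders.
mutant = {"orf_A": 4100.0, "orf_B": 600.0, "orf_C": 1100.0}
wild_type = {"orf_A": 1100.0, "orf_B": 2100.0, "orf_C": 1100.0}
ratios = log2_ratios(mutant, wild_type, background=100.0)
print(ratios)
```

A positive value indicates a spot with a stronger signal in the mutant than in the reference, a negative value a weaker signal, and zero an unchanged signal.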

[0174] The hybridization intensity can be measured by visualizing the fluorescent signal, radioactivity, luminescence
dose, and the like, using a laser confocal microscope, a CCD camera, a radiation imaging device (for example, STORM
manufactured by Amersham Pharmacia Biotech), and the like, and then quantifying the thus visualized data.

[0175] A polynucleotide array on a solid support can also be analyzed and quantified using a commercially available
apparatus, such as GMS418 Array Scanner (manufactured by Takara Shuzo) or the like.

[0176] The gene expression amount can be analyzed using a commercially available software (for example, ImaGene
manufactured by Takara Shuzo; Array Gauge manufactured by Fuji Photo Film; ImageQuant manufactured by
Amersham Pharmacia Biotech, or the like).

[0177] A fluctuation in the expression amount of a specific gene can be monitored using a nucleic acid molecule
obtained in the time course of culture as the nucleic acid molecule derived from coryneform bacteria. The culture
conditions can be optimized by analyzing the fluctuation.

[0178] The expression profile of the microorganism at the total gene level (namely, which genes among a great
number of genes encoded by the genome have been expressed and the expression ratio thereof) can be determined
using a nucleic acid molecule having the sequences of many genes determined from the full genome sequence of the
microorganism. Thus, the expression amount of the genes determined by the full genome sequence can be analyzed
and, in its turn, the biological conditions of the microorganism can be recognized as the expression pattern at the full
gene level.

(b) Confirmation of the presence of gene homologous to examined gene in coryneform bacteria

[0179] Whether or not a gene homologous to the examined gene, which is present in an organism other than
coryneform bacteria, is present in coryneform bacteria can be detected using the polynucleotide array prepared in the
above (1).

[0180] This detection can be carried out by a method in which an examined gene which is present in an organism
other than coryneform bacteria is used instead of the nucleic acid molecule derived from coryneform bacteria used in
the above identification/analysis method of (1).

8. Recording medium storing full genome nucleotide sequence and ORF data and being readable by a computer and
methods for using the same

[0181] The term "recording medium or storage device which is readable by a computer" means a recording medium
or storage medium which can be directly read out and accessed with a computer. Examples include magnetic recording
media, such as a floppy disk, a hard disk, a magnetic tape, and the like; optical recording media, such as CD-ROM,
CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and the like; electric recording media, such as RAM, ROM, and the
like; and hybrids in these categories (for example, magnetic/optical recording media, such as MO and the like).
[0182] Instruments for recording or inputting in or on the recording medium or instruments or devices for reading out
the information in the recording medium can be appropriately selected, depending on the type of the recording medium
and the access device utilized. Also, various data processing programs, software, comparators and formats are used
for recording and utilizing the polynucleotide sequence information or the like of the present invention in the recording
medium. The information can be expressed in the form of a binary file, a text file or an ASCII file formatted with
commercially available software, for example. Moreover, software for accessing the sequence information is available and
known to one of ordinary skill in the art.

[0183] Examples of the information to be recorded in the above-described medium include the full genome nucleotide
sequence information of coryneform bacteria as obtained in the above item 2, the nucleotide sequence information of
ORF, the amino acid sequence information encoded by the ORF, and the functional information of polynucleotides
coding for the amino acid sequences.
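
As one illustration of recording such sequence information as a text file, the following minimal sketch writes records in a FASTA-style layout (a header line beginning with ">" followed by wrapped sequence lines) and reads them back. The choice of this layout, the function names and the record identifiers are assumptions made for illustration; the specification does not prescribe a particular file format.

```python
def to_fasta(records, width=60):
    """Serialize (identifier, sequence) pairs as FASTA-style text."""
    lines = []
    for identifier, sequence in records:
        lines.append(f">{identifier}")
        # Wrap the sequence so that no line exceeds `width` characters.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


def from_fasta(text):
    """Parse FASTA-style text back into (identifier, sequence) pairs."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif line and records:
            records[-1][1] += line
    return [tuple(record) for record in records]


# Hypothetical records: one nucleotide sequence and one amino acid sequence.
example_records = [("SEQ_1", "ATGC" * 20), ("SEQ_2", "MKV")]
round_trip = from_fasta(to_fasta(example_records))
```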

[0184] The recording medium or storage device which is readable by a computer according to the present invention
refers to a medium in which the information of the present invention has been recorded. Examples include recording
media or storage devices which are readable by a computer storing the nucleotide sequence information represented
by SEQ ID NOS:1 to 3501, the amino acid sequence information represented by SEQ ID NOS:3502 to 7001, the
functional information of the nucleotide sequences represented by SEQ ID NOS:1 to 3501, the functional information
of the amino acid sequences represented by SEQ ID NOS:3502 to 7001, and the information listed in Table 1 below
and the like.

9. System based on a computer using the recording medium of the present invention which is readable by a computer

[0185] The term "system based on a computer" as used herein refers to a system composed of hardware device(s),
software device(s), and data recording device(s) which are used for analyzing the data recorded in the recording
medium of the present invention which is readable by a computer.

[0186] The hardware device(s) are, for example, composed of an input unit, a data recording unit, a central processing
unit and an output unit collectively or individually.

[0187] By the software device(s), the data recorded in the recording medium of the present invention are searched
or analyzed using the recorded data and the hardware device(s) as described herein. Specifically, the software
device(s) contain at least one program which acts on or with the system in order to screen, analyze or compare biologically
meaningful structures or information from the nucleotide sequences, amino acid sequences and the like recorded in
the recording medium according to the present invention.

[0188] Examples of the software device(s) for identifying ORF and EMF domains include GeneMark (Nuc. Acids
Res., 22: 4756-67 (1994)), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)), Glimmer (The
Institute of Genomic Research; Nuc. Acids Res., 26: 544-548 (1998)) and the like. In the process of using such a software
device, the default (initial setting) parameters are usually used, although the parameters can be changed, if necessary,
in a manner known to one of ordinary skill in the art.
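
To make the notion of ORF identification concrete, the following is a deliberately simplified sketch that scans the three forward reading frames of a nucleotide sequence for stretches running from an ATG codon to the next in-frame stop codon. It is not the algorithm used by GeneMark, GeneHacker or Glimmer, which rely on statistical models; the function name, the minimum-length parameter and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    `start` is the index of the A of ATG, `end` is exclusive and includes
    the stop codon. Only ORFs with at least `min_codons` codons before the
    stop codon are reported.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical sequence containing one short ORF in frame 2.
example_orfs = find_orfs("CCATGAAACCCGGGTAACC")
```

A practical tool would also scan the complementary strand and consider alternative start codons; both are omitted here for brevity.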

[0189] Examples of the software device(s) for identifying a genome domain or a polypeptide domain analogous to
the target sequence or the target structural motif (homology searching) include FASTA, BLAST, Smith-Waterman,
GenetyxMac (manufactured by Software Development), GCG Package (manufactured by Genetic Computer Group),
GenCore (manufactured by Compugen), and the like. In the process of using such a software device, the default (initial
setting) parameters are usually used, although the parameters can be changed, if necessary, in a manner known to
one of ordinary skill in the art.
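
As an illustration of homology searching, the following minimal sketch computes the best local alignment score between two sequences by Smith-Waterman dynamic programming with a linear gap penalty. The scoring values (match, mismatch, gap) are arbitrary assumptions chosen for illustration and are not the default parameters of any of the software named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                               # start a new local alignment
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


# Hypothetical query and subject sharing the local region "ACGT".
shared_region_score = smith_waterman_score("TTACGTTT", "GGACGTGG")
```

Only two rows of the scoring matrix are kept, which suffices for the score; recovering the alignment itself would require retaining the full matrix for traceback.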

[0190] Such a recording medium storing the full genome sequence data is useful in preparing a polynucleotide array
by which the expression amount of a gene encoded by the genome DNA of coryneform bacteria and the expression
profile at the total gene level of the microorganism, namely, which genes among many genes encoded by the genome
have been expressed and the expression ratio thereof, can be determined.

[0191] The data recording device(s) provided by the present invention are, for example, memory device(s) for
recording the data recorded in the recording medium of the present invention and target sequence or target structural
motif data, or the like, and a memory accessing device(s) for accessing the same.

[0192] Namely, the system based on a computer according to the present invention comprises the following: 

(i) a user input device that inputs the information stored in the recording medium of the present invention, and
target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the information stored in the recording medium of the present invention with the
target sequence or target structure motif information, recorded by the data storing device of (ii), for screening and
analyzing nucleotide sequence information which is coincident with or analogous to the target sequence or target
structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

[0193] This system is usable in the methods in items 2 to 5 as described above for searching and analyzing the ORF
and EMF domains, target sequence, target structural motif, etc. of a coryneform bacterium, searching homologs,
searching and analyzing isozymes, determining the biosynthesis pathway and the signal transmission pathway, and
identifying spots which have been found in the proteome analysis. The term "homologs" as used herein includes both
orthologs and paralogs.
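
The four components (i) to (iv) of the system can be sketched in software as follows. This is only an illustrative arrangement under stated assumptions: the class name, the method names and the comparison rule (an ungapped window comparison tolerating a chosen number of mismatches) are hypothetical and stand in for whichever comparator is actually employed.

```python
class SequenceSystem:
    """Toy model of the input, storage, comparator and output components."""

    def __init__(self):
        self.stored = {}  # (ii) data storage: name -> sequence

    def load(self, records):
        """(i) Input: accept recorded sequence information."""
        self.stored.update(records)

    def compare(self, target, max_mismatches=0):
        """(iii) Comparator: find windows coincident with or analogous to target."""
        hits = []
        for name, sequence in self.stored.items():
            for offset in range(len(sequence) - len(target) + 1):
                window = sequence[offset:offset + len(target)]
                mismatches = sum(1 for x, y in zip(window, target) if x != y)
                if mismatches <= max_mismatches:
                    hits.append((name, offset, mismatches))
        return hits

    @staticmethod
    def report(hits):
        """(iv) Output: format the screening result as tab-separated lines."""
        return [f"{name}\t{offset}\t{mismatches}" for name, offset, mismatches in hits]


# Hypothetical stored sequences and target sequence.
system = SequenceSystem()
system.load({"orf_a": "ATGGCGTTAA", "orf_b": "ATGCCCTGA"})
exact_hits = system.compare("GCGT")
loose_hits = system.compare("GCGT", max_mismatches=2)
```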

10. Production of polypeptide using ORF derived from coryneform bacteria 

[0194] The polypeptide of the present invention can be produced using a polynucleotide comprising the ORF obtained
in the above item 2. Specifically, the polypeptide of the present invention can be produced by expressing the
polynucleotide of the present invention or a fragment thereof in a host cell, using the method described in Molecular Cloning,
2nd ed., Current Protocols in Molecular Biology, and the like, for example, according to the following method.
[0195] A DNA fragment having a suitable length containing a part encoding the polypeptide is prepared from the full
length ORF sequence, if necessary.
[0196] Also, DNA in which nucleotides in a nucleotide sequence at a part encoding the polypeptide of the present
invention are replaced to give codons suitable for expression in the host cell is prepared, if necessary. The DNA is useful for
efficiently producing the polypeptide of the present invention.
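
The codon replacement described above can be sketched as a simple substitution of synonymous codons. The preference table below is hypothetical and purely illustrative; it is not measured codon usage for any host, and in practice a table derived from the host's highly expressed genes would be substituted.

```python
# Hypothetical table mapping a codon to a synonymous codon assumed to be
# preferred by the host (AGA/AGG/CGT = Arg, CTA/CTG = Leu, ATA/ATC = Ile).
PREFERRED_CODONS = {"AGA": "CGT", "AGG": "CGT", "CTA": "CTG", "ATA": "ATC"}


def replace_codons(coding_sequence, preferred):
    """Replace codons according to `preferred`, leaving other codons unchanged."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    return "".join(preferred.get(codon, codon) for codon in codons)


# ATG AGA CTA ATA TAA -> ATG CGT CTG ATC TAA (encoded amino acids unchanged).
optimized = replace_codons("ATGAGACTAATATAA", PREFERRED_CODONS)
```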

[0197] A recombinant vector is prepared by inserting the DNA fragment into the downstream of a promoter in a
suitable expression vector.
[0198] The recombinant vector is introduced to a host cell suitable for the expression vector.

[0199] Any of bacteria, yeasts, animal cells, insect cells, plant cells, and the like can be used as the host cell so long
as it can express the gene of interest.

[0200] Examples of the expression vector include those which can replicate autonomously in the above-described
host cell or can be integrated into a chromosome and have a promoter at such a position that the DNA encoding the
polypeptide of the present invention can be transcribed.

[0201] When a prokaryotic cell, such as a bacterium or the like, is used as the host cell, it is preferred that the
recombinant vector containing the DNA encoding the polypeptide of the present invention can replicate autonomously
in the bacterium and is a recombinant vector constituted by at least a promoter, a ribosome binding sequence, the
DNA of the present invention and a transcription termination sequence. A promoter controlling gene can also be
contained therewith in operable combination.

[0202] Examples of the expression vectors include a vector plasmid which is replicable in Corynebacterium
glutamicum, such as pCG1 (Japanese Published Unexamined Patent Application No. 134500/82), pCG2 (Japanese Published
Unexamined Patent Application No. 35197/83), pCG4 (Japanese Published Unexamined Patent Application No.
183799/82), pCG11 (Japanese Published Unexamined Patent Application No. 134500/82), pCG116, pCE54 and
pCB101 (Japanese Published Unexamined Patent Application No. 105999/83), pCE51, pCE52 and pCE53 (Mol. Gen.
Genet., 196: 175-178 (1984)), and the like; a vector plasmid which is replicable in Escherichia coli, such as pET3 and
pET11 (manufactured by Stratagene), pBAD, pThioHis and pTrcHis (manufactured by Invitrogen), pKK223-3 and
pGEX2T (manufactured by Amersham Pharmacia Biotech), and the like; and pBTrp2, pBTac1 and pBTac2
(manufactured by Boehringer Mannheim Co.), pSE280 (manufactured by Invitrogen), pGEMEX-1 (manufactured by Promega),
pQE-8 (manufactured by QIAGEN), pKYP10 (Japanese Published Unexamined Patent Application No. 110600/83),
pKYP200 (Agric. Biol. Chem., 48: 669 (1984)), pLSA1 (Agric. Biol. Chem., 53: 277 (1989)), pGEL1 (Proc. Natl. Acad.
Sci. USA, 82: 4306 (1985)), pBluescript II SK(-) (manufactured by Stratagene), pTrs30 (prepared from Escherichia coli
JM109/pTrS30 (FERM BP-5407)), pTrs32 (prepared from Escherichia coli JM109/pTrS32 (FERM BP-5408)), pGHA2
(prepared from Escherichia coli IGHA2 (FERM B-400), Japanese Published Unexamined Patent Application No.
221091/85), pGKA2 (prepared from Escherichia coli IGKA2 (FERM BP-6798), Japanese Published Unexamined Patent
Application No. 221091/85), pTerm2 (U.S. Patents 4,686,191, 4,939,094 and 5,160,735), pSupex, pUB110, pTP5,
pC194 and pEG400 (J. Bacteriol., 172: 2392 (1990)), pGEX (manufactured by Pharmacia), pET system (manufactured
by Novagen), and the like.

[0203] Any promoter can be used so long as it can function in the host cell. Examples include promoters derived
from Escherichia coli, phage and the like, such as trp promoter (Ptrp), lac promoter, PL promoter, PR promoter, T7
promoter and the like. Also, artificially designed and modified promoters, such as a promoter in which two Ptrp are
linked in series (Ptrp x 2), tac promoter, lacT7 promoter, letI promoter and the like, can be used.

[0204] It is preferred to use a plasmid in which the space between the Shine-Dalgarno sequence, which is the ribosome
binding sequence, and the initiation codon is adjusted to an appropriate distance (for example, 6 to 18 nucleotides).
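
The spacing condition above can be checked on a candidate construct with a short script. The sketch below assumes, purely for illustration, that the Shine-Dalgarno sequence is the motif AGGAGG, that the initiation codon is the first ATG downstream of it, and that the distance is counted as the number of nucleotides lying between the two; the specification does not fix these details, and the function names are hypothetical.

```python
def sd_spacing(sequence, sd_motif="AGGAGG", start_codon="ATG"):
    """Return the number of nucleotides between the SD motif and the start codon.

    Returns None when either element is not found.
    """
    sd_position = sequence.find(sd_motif)
    if sd_position < 0:
        return None
    sd_end = sd_position + len(sd_motif)
    start_position = sequence.find(start_codon, sd_end)
    if start_position < 0:
        return None
    return start_position - sd_end


def spacing_is_appropriate(sequence, low=6, high=18):
    """Check whether the spacing falls within the range given as an example."""
    spacing = sd_spacing(sequence)
    return spacing is not None and low <= spacing <= high


# Hypothetical construct: AGGAGG, seven spacer nucleotides, then ATG.
construct = "TTAGGAGGAACAGCTATGGCT"
construct_spacing = sd_spacing(construct)
```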
[0205] The transcription termination sequence is not always necessary for the expression of the DNA of the present
invention. However, it is preferred to arrange the transcription termination sequence immediately downstream of the structural
gene.

[0206] One of ordinary skill in the art will appreciate that the codons of the above-described elements may be
optimized in a known manner, depending on the host cells and environmental conditions utilized.
[0207] Examples of the host cell include microorganisms belonging to the genus Escherichia, the genus Serratia,
the genus Bacillus, the genus Brevibacterium, the genus Corynebacterium, the genus Microbacterium, the genus
Pseudomonas, and the like. Specific examples include Escherichia coli XL1-Blue, Escherichia coli XL2-Blue, Escherichia
coli DH1, Escherichia coli MC1000, Escherichia coli KY3276, Escherichia coli W1485, Escherichia coli JM109,
Escherichia coli HB101, Escherichia coli No. 49, Escherichia coli W3110, Escherichia coli NY49, Escherichia coli GI698,
Escherichia coli TB1, Serratia ficaria, Serratia fonticola, Serratia liquefaciens, Serratia marcescens, Bacillus subtilis,
Bacillus amyloliquefaciens, Corynebacterium ammoniagenes, Brevibacterium immariophilum ATCC 14068,
Brevibacterium saccharolyticum ATCC 14066, Corynebacterium glutamicum ATCC 13032, Corynebacterium glutamicum ATCC
13869, Corynebacterium glutamicum ATCC 14067 (prior genus and species: Brevibacterium flavum), Corynebacterium
glutamicum ATCC 13869 (prior genus and species: Brevibacterium lactofermentum, or Corynebacterium
lactofermentum), Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium thermoaminogenes FERM 9244,
Microbacterium ammoniaphilum ATCC 15354, Pseudomonas putida, Pseudomonas sp. D-0110, and the like.
[0208] When Corynebacterium glutamicum or an analogous microorganism is used as a host, an EMF necessary
for expressing the polypeptide is not always contained in the vector so long as the polynucleotide of the present
invention contains an EMF. When the EMF is not contained in the polynucleotide, it is necessary to prepare the EMF
separately and ligate it so as to be in operable combination. Also, when a higher expression amount or specific
expression regulation is necessary, it is necessary to ligate the EMF corresponding thereto so as to put the EMF in
operable combination with the polynucleotide. Examples of using an externally ligated EMF are disclosed in
Microbiology, 142: 1297-1309 (1996).

[0209] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into
the above-described host cells, such as a method in which a calcium ion is used (Proc. Natl. Acad. Sci. USA, 69: 2110
(1972)), a protoplast method (Japanese Published Unexamined Patent Application No. 2483942/88), the methods
described in Gene, 17: 107 (1982) and Molecular & General Genetics, 168: 111 (1979) and the like, can be used.

[0210] When yeast is used as the host cell, examples of the expression vector include pYES2 (manufactured by
Invitrogen), YEp13 (ATCC 37115), YEp24 (ATCC 37051), YCp50 (ATCC 37419), pHS19, pHS15, and the like.
[0211] Any promoter can be used so long as it can be expressed in yeast. Examples include a promoter of a gene
in the glycolytic pathway, such as hexose kinase and the like, PHO5 promoter, PGK promoter, GAP promoter, ADH
promoter, gal 1 promoter, gal 10 promoter, a heat shock protein promoter, MF α1 promoter, CUP 1 promoter, and the like.

[0212] Examples of the host cell include microorganisms belonging to the genus Saccharomyces, the genus
Schizosaccharomyces, the genus Kluyveromyces, the genus Trichosporon, the genus Schwanniomyces, the genus
Pichia, the genus Candida and the like. Specific examples include Saccharomyces cerevisiae, Schizosaccharomyces
pombe, Kluyveromyces lactis, Trichosporon pullulans, Schwanniomyces alluvius, Candida utilis and the like.
[0213] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into
yeast, such as an electroporation method (Methods. Enzymol., 194: 182 (1990)), a spheroplast method (Proc. Natl.
Acad. Sci. USA, 75: 1929 (1978)), a lithium acetate method (J. Bacteriol., 153: 163 (1983)), a method described in
Proc. Natl. Acad. Sci. USA, 75: 1929 (1978) and the like, can be used.

[0214] When animal cells are used as the host cells, examples of the expression vector include pcDNA3.1, pSinRep5
and pCEP4 (manufactured by Invitrogen), pRev-Tre (manufactured by Clontech), pAxCAwt (manufactured by Takara
Shuzo), pcDNAI and pcDM8 (manufactured by Funakoshi), pAGE107 (Japanese Published Unexamined Patent
Application No. 22979/91; Cytotechnology, 3: 133 (1990)), pAS3-3 (Japanese Published Unexamined Patent Application
No. 227075/90), pcDM8 (Nature, 329: 840 (1987)), pcDNAI/Amp (manufactured by Invitrogen), pREP4 (manufactured
by Invitrogen), pAGE103 (J. Biochem., 101: 1307 (1987)), pAGE210, and the like.

[0215] Any promoter can be used so long as it can function in animal cells. Examples include a promoter of IE
(immediate early) gene of cytomegalovirus (CMV), an early promoter of SV40, a promoter of retrovirus, a
metallothionein promoter, a heat shock promoter, SRα promoter, and the like. Also, the enhancer of the IE gene of human
CMV can be used together with the promoter.

[0216] Examples of the host cell include human Namalwa cell, monkey COS cell, Chinese hamster CHO cell, 
HBT5637 (Japanese Published Unexamined Patent Application No. 299/88), and the like.
[0217] The method for introduction of the recombinant vector into animal cells is not particularly limited, so long as
it is the general method for introducing DNA into animal cells, such as an electroporation method (Cytotechnology, 3:
133 (1990)), a calcium phosphate method (Japanese Published Unexamined Patent Application No. 227075/90), a
lipofection method (Proc. Natl. Acad. Sci. USA, 84: 7413 (1987)), the method described in Virology, 52: 456 (1973),
and the like.

[0218] When insect cells are used as the host cells, the polypeptide can be expressed, for example, by the method
described in Baculovirus Expression Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992),
Bio/Technology, 6: 47 (1988), or the like.

[0219] Specifically, a recombinant gene transfer vector and baculovirus are simultaneously inserted into insect cells






to obtain a recombinant virus in an insect cell culture supernatant, and then the insect cells are infected with the resulting
recombinant virus to express the polypeptide.

[0220] Examples of the gene introducing vector used in the method include pBlueBac4.5, pVL1392, pVL1393 and
pBlueBacIII (manufactured by Invitrogen), and the like.
[0221] Examples of the baculovirus include Autographa californica nuclear polyhedrosis virus with which insects of
the family Barathra are infected, and the like.

[0222] Examples of the insect cells include Spodoptera frugiperda oocytes Sf9 and Sf21 (Baculovirus Expression
Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992)), Trichoplusia ni oocyte High 5
(manufactured by Invitrogen) and the like.
[0223] The method for simultaneously incorporating the above-described recombinant gene transfer vector and the
above-described baculovirus for the preparation of the recombinant virus includes the calcium phosphate method (Japanese
Published Unexamined Patent Application No. 227075/90), the lipofection method (Proc. Natl. Acad. Sci. USA, 84: 7413
(1987)) and the like.

[0224] When plant cells are used as the host cells, examples of the expression vector include a Ti plasmid, a tobacco
mosaic virus vector, and the like.

[0225] Any promoter can be used so long as it can be expressed in plant cells. Examples include 35S promoter of 
cauliflower mosaic virus (CaMV), rice actin 1 promoter, and the like. 

[0226] Examples of the host cells include plant cells and the like, such as tobacco, potato, tomato, carrot, soybean, 
rape, alfalfa, rice, wheat, barley, and the like. 

[0227] The method for introducing the recombinant vector is not particularly limited, so long as it is the general method
for introducing DNA into plant cells, such as the Agrobacterium method (Japanese Published Unexamined Patent
Application No. 140885/84, Japanese Published Unexamined Patent Application No. 70080/85, WO 94/00977), the
electroporation method (Japanese Published Unexamined Patent Application No. 251887/85), the particle gun method
(Japanese Patents 2606856 and 2517813), and the like.

[0228] The transformant of the present invention includes a transformant containing the polypeptide of the present
invention per se rather than as a recombinant vector, that is, a transformant containing the polypeptide of the present
invention which is integrated into a chromosome of the host, in addition to the transformant containing the above
recombinant vector.

[0229] When expressed in yeasts, animal cells, insect cells or plant cells, a glycopolypeptide or glycosylated
polypeptide can be obtained.

[0230] The polypeptide can be produced by culturing the thus obtained transformant of the present invention in a
culture medium to produce and accumulate the polypeptide of the present invention or any polypeptide expressed
under the control of an EMF of the present invention, and recovering the polypeptide from the culture.
[0231] Culturing of the transformant of the present invention in a culture medium is carried out according to the
conventional method as used in culturing of the host.

[0232] When the transformant of the present invention is obtained using a prokaryote, such as Escherichia coli or
the like, or a eukaryote, such as yeast or the like, as the host, the transformant is cultured.

[0233] Any of a natural medium and a synthetic medium can be used, so long as it contains a carbon source, a
nitrogen source, an inorganic salt and the like which can be assimilated by the transformant and can perform culturing
of the transformant efficiently.

[0234] Examples of the carbon source include those which can be assimilated by the transformant, such as
carbohydrates (for example, glucose, fructose, sucrose, molasses containing them, starch, starch hydrolysate, and the like),
organic acids (for example, acetic acid, propionic acid, and the like), and alcohols (for example, ethanol, propanol, and
the like).

[0235] Examples of the nitrogen source include ammonia, various ammonium salts of inorganic acids or organic
acids (for example, ammonium chloride, ammonium sulfate, ammonium acetate, ammonium phosphate, and the like),
other nitrogen-containing compounds, peptone, meat extract, yeast extract, corn steep liquor, casein hydrolysate,
soybean meal and soybean meal hydrolysate, various fermented cells and hydrolysates thereof, and the like.
[0236] Examples of inorganic salt include potassium dihydrogen phosphate, dipotassium hydrogen phosphate,
magnesium phosphate, magnesium sulfate, sodium chloride, ferrous sulfate, manganese sulfate, copper sulfate, calcium
carbonate, and the like.

[0237] The culturing is carried out under aerobic conditions by shaking culture, submerged-aeration stirring culture
or the like. The culturing temperature is preferably from 15 to 40°C, and the culturing time is generally from 16 hours
to 7 days. The pH of the medium is preferably maintained at 3.0 to 9.0 during the culturing. The pH can be adjusted
using an inorganic or organic acid, an alkali solution, urea, calcium carbonate, ammonia, or the like.

[0238] Also, antibiotics, such as ampicillin, tetracycline, and the like, can be added to the medium during the culturing, 
if necessary. 

[0239] When a microorganism transformed with a recombinant vector containing an inducible promoter is cultured,
an inducer can be added to the medium, if necessary.

[0240] For example, isopropyl-β-D-thiogalactopyranoside (IPTG) or the like can be added to the medium when a
microorganism transformed with a recombinant vector containing lac promoter is cultured, or indoleacrylic acid (IAA)
or the like can be added thereto when a microorganism transformed with an expression vector containing trp promoter
is cultured.

[0241] Examples of the medium used in culturing a transformant obtained using animal cells as the host cells include
RPMI 1640 medium (The Journal of the American Medical Association, 199: 519 (1967)), Eagle's MEM medium
(Science, 122: 501 (1952)), Dulbecco's modified MEM medium (Virology, 8: 396 (1959)), 199 Medium (Proceeding of the
Society for the Biological Medicine, 73: 1 (1950)), the above-described media to which fetal calf serum has been added,
and the like.

[0242] The culturing is carried out generally at a pH of 6 to 8 and a temperature of 30 to 40°C in the presence of 5%
CO2 for 1 to 7 days.

[0243] Also, if necessary, antibiotics, such as kanamycin, penicillin, and the like, can be added to the medium during
the culturing.

[0244] Examples of the medium used in culturing a transformant obtained using insect cells as the host cells include
TNM-FH medium (manufactured by Pharmingen), Sf-900 II SFM (manufactured by Life Technologies), ExCell 400 and
ExCell 405 (manufactured by JRH Biosciences), Grace's Insect Medium (Nature, 195: 788 (1962)), and the like.
[0245] The culturing is carried out generally at a pH of 6 to 7 and a temperature of 25 to 30°C for 1 to 5 days.
[0246] Additionally, antibiotics, such as gentamicin and the like, can be added to the medium during the culturing, if
necessary.

[0247] A transformant obtained by using a plant cell as the host cell can be used as the cell or after differentiating
to a plant cell or organ. Examples of the medium used in the culturing of the transformant include Murashige and Skoog
(MS) medium, White medium, media to which a plant hormone, such as auxin, cytokinin, or the like has been added,
and the like.

[0248] The culturing is carried out generally at a pH of 5 to 9 and a temperature of 20 to 40°C for 3 to 60 days.

[0249] Also, antibiotics, such as kanamycin, hygromycin and the like, can be added to the medium during the cul- 
turing, if necessary. 

[0250] As described above, the polypeptide can be produced by culturing a transformant derived from a
microorganism, animal cell or plant cell containing a recombinant vector to which a DNA encoding the polypeptide of the
present invention has been inserted according to the general culturing method to produce and accumulate the
polypeptide, and recovering the polypeptide from the culture.

[0251] The process of gene expression may include secretory production of the encoded protein, fusion protein
expression and the like in accordance with the methods described in Molecular Cloning, 2nd ed., in addition to direct
expression.

[0252] The method for producing the polypeptide of the present invention includes a method of intracellular
expression in a host cell, a method of extracellular secretion from a host cell, or a method of production on a host cell membrane
outer envelope. The method can be selected by changing the host cell employed or the structure of the polypeptide
produced.

[0253] When the polypeptide of the present invention is produced in a host cell or on a host cell membrane outer
envelope, the polypeptide can be positively secreted extracellularly according to, for example, the method of Paulson
et al. (J. Biol. Chem., 264: 17619 (1989)), the method of Lowe et al. (Proc. Natl. Acad. Sci. USA, 86: 8227 (1989);
Genes Develop., 4: 1288 (1990)), and/or the methods described in Japanese Published Unexamined Patent Application
No. 336963/93, WO 94/23021, and the like.

[0254] Specifically, the polypeptide of the present invention can be positively secreted extracellularly by expressing
it in the form in which a signal peptide has been added in front of a polypeptide containing an active site of the
polypeptide of the present invention according to the recombinant DNA technique.

[0255] Furthermore, the amount produced can be increased using a gene amplification system, such as by use of
a dihydrofolate reductase gene or the like according to the method described in Japanese Published Unexamined
Patent Application No. 227075/90.
[0256] Moreover, the polypeptide of the present invention can be produced by a transgenic animal individual
(transgenic nonhuman animal) or plant individual (transgenic plant).

[0257] When the transformant is the animal individual or plant individual, the polypeptide of the present invention 
can be produced by breeding or cultivating it so as to produce and accumulate the polypeptide, and recovering the 
polypeptide from the animal individual or plant individual. 
[0258] Examples of the method for producing the polypeptide of the present invention using the animal individual
include a method for producing the polypeptide of the present invention in an animal developed by inserting a gene
according to methods known to those of ordinary skill in the art (American Journal of Clinical Nutrition, 63: 639S (1996),
American Journal of Clinical Nutrition, 63: 627S (1996), Bio/Technology, 9: 830 (1991)).
[0259] In the animal individual, the polypeptide can be produced by breeding a transgenic nonhuman animal to which
the DNA encoding the polypeptide of the present invention has been inserted to produce and accumulate the
polypeptide in the animal, and recovering the polypeptide from the animal. Examples of the production and accumulation place
in the animal include milk (Japanese Published Unexamined Patent Application No. 309192/88), egg and the like of
the animal. Any promoter can be used, so long as it can be expressed in the animal. Suitable examples include an
α-casein promoter, a β-casein promoter, a β-lactoglobulin promoter, a whey acidic protein promoter, and the like, which
are specific for mammary glandular cells.

[0260] Examples of the method for producing the polypeptide of the present invention using the plant individual
include a method for producing the polypeptide of the present invention by cultivating a transgenic plant to which the
DNA encoding the protein of the present invention has been introduced by a known method (Tissue Culture, 20 (1994),
Tissue Culture, 21 (1994), Trends in Biotechnology, 15: 45 (1997)) to produce and accumulate the polypeptide in the
plant, and recovering the polypeptide from the plant.

[0261] The polypeptide according to the present invention can also be obtained by translation in vitro. 

[0262] The polypeptide of the present invention can be produced by a translation system in vitro. There are, for
example, two in vitro translation methods which may be used, namely, a method using RNA as a template and another
method using DNA as a template. The template RNA includes the whole RNA, mRNA, an in vitro transcription product,
and the like. The template DNA includes a plasmid containing a transcriptional promoter and a target gene integrated
therein and downstream of the initiation site, a PCR/RT-PCR product and the like. To select the most suitable system
for the in vitro translation, the origin of the gene encoding the protein to be synthesized (prokaryotic cell/eukaryotic
cell), the type of the template (DNA/RNA), the purpose of using the synthesized protein and the like should be
considered. In vitro translation kits having various characteristics are commercially available from many companies
(Boehringer Mannheim, Promega, Stratagene, or the like), and every kit can be used in producing the polypeptide according
to the present invention.

[0263] Transcription/translation of a DNA nucleotide sequence cloned into a plasmid containing a T7 promoter can
be carried out using an in vitro transcription/translation system E. coli T7 S30 Extract System for Circular DNA
(manufactured by Promega, catalogue No. L1130). Also, transcription/translation using, as a template, a linear prokaryotic
DNA of a supercoil non-sensitive promoter, such as lacUV5, tac, λPL(con), λPL, or the like, can be carried out using
an in vitro transcription/translation system E. coli S30 Extract System for Linear Templates (manufactured by Promega,
catalogue No. L1030). Examples of the linear prokaryotic DNA used as a template include a DNA fragment, a
PCR-amplified DNA product, a duplicated oligonucleotide ligation, an in vitro transcriptional RNA, a prokaryotic RNA, and
the like.

[0264] In addition to the production of the polypeptide according to the present invention, synthesis of a radioactive 
labeled protein, confirmation of the expression capability of a cloned gene, analysis of the function of transcriptional 
reaction or translation reaction, and the like can be carried out using this system. 

[0265] The polypeptide produced by the transformant of the present invention can be isolated and purified using the
general method for isolating and purifying an enzyme. For example, when the polypeptide of the present invention is
expressed as a soluble product in the host cells, the cells are collected by centrifugation after cultivation, suspended
in an aqueous buffer, and disrupted using an ultrasonicator, a French press, a Manton Gaulin homogenizer, a Dynomill,
or the like to obtain a cell-free extract. From the supernatant obtained by centrifuging the cell-free extract, a purified
product can be obtained by the general method used for isolating and purifying an enzyme, for example, solvent
extraction, salting out using ammonium sulfate or the like, desalting, precipitation using an organic solvent, anion
exchange chromatography using a resin, such as diethylaminoethyl (DEAE)-Sepharose, DIAION HPA-75 (manufactured
by Mitsubishi Chemical) or the like, cation exchange chromatography using a resin, such as S-Sepharose FF
(manufactured by Pharmacia) or the like, hydrophobic chromatography using a resin, such as butyl sepharose, phenyl
sepharose or the like, gel filtration using a molecular sieve, affinity chromatography, chromatofocusing, or electrophoresis,
such as isoelectric focusing or the like, alone or in combination thereof.

[0266] When the polypeptide is expressed as an insoluble product in the host cells, the cells are collected in the
same manner, disrupted and centrifuged to recover the insoluble product of the polypeptide as the precipitate fraction.
Next, the insoluble product of the polypeptide is solubilized with a protein denaturing agent. The solubilized solution
is diluted or dialyzed to lower the concentration of the protein denaturing agent in the solution. Thus, the normal
configuration of the polypeptide is reconstituted. After the procedure, a purified product of the polypeptide can be obtained
by a purification/isolation method similar to the above.

[0267] When the polypeptide of the present invention or its derivative (for example, a polypeptide formed by adding
a sugar chain thereto) is secreted out of cells, the polypeptide or its derivative can be collected in the culture supernatant.
Namely, the culture supernatant is obtained by treating the culture medium in a treatment similar to the above (for
example, centrifugation). Then, a purified product can be obtained from the culture medium using a purification/isolation
method similar to the above.

[0268] The polypeptide obtained by the above method is within the scope of the polypeptide of the present invention,
and examples include a polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from
SEQ ID NOS:2 to 3431, and a polypeptide comprising an amino acid sequence represented by any one of SEQ ID
NOS:3502 to 6931.

[0269] Furthermore, a polypeptide comprising an amino acid sequence in which at least one amino acid is deleted,
replaced, inserted or added in the amino acid sequence of the polypeptide and having substantially the same activity
as that of the polypeptide is included in the scope of the present invention. The term "substantially the same activity
as that of the polypeptide" means the same activity represented by the inherent function, enzyme activity or the like
possessed by the polypeptide which has not been deleted, replaced, inserted or added. The polypeptide can be
obtained using a method for introducing site-specific mutation(s) described in, for example, Molecular Cloning, 2nd ed.,
Current Protocols in Molecular Biology, Nuc. Acids. Res., 10: 6487 (1982), Proc. Natl. Acad. Sci. USA, 79: 6409 (1982),
Gene, 34: 315 (1985), Nuc. Acids. Res., 13: 4431 (1985), Proc. Natl. Acad. Sci. USA, 82: 488 (1985) and the like. For
example, the polypeptide can be obtained by introducing mutation(s) to DNA encoding a polypeptide having the amino
acid sequence represented by any one of SEQ ID NOS:3502 to 6931. The number of the amino acids which are deleted,
replaced, inserted or added is not particularly limited; however, it is usually 1 to the order of tens, preferably 1 to 20,
more preferably 1 to 10, and most preferably 1 to 5, amino acids.

[0270] The at least one amino acid deletion, replacement, insertion or addition in the amino acid sequence of the
polypeptide of the present invention is used herein to mean that at least one amino acid is deleted, replaced, inserted
or added at one or plural positions in the amino acid sequence. The deletion, replacement, insertion or addition may
be caused in the same amino acid sequence simultaneously. Also, the amino acid residue replaced, inserted or added
can be natural or non-natural. Examples of the natural amino acid residue include L-alanine, L-asparagine, L-aspartic
acid, L-glutamine, L-glutamic acid, glycine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine,
L-proline, L-serine, L-threonine, L-tryptophan, L-tyrosine, L-valine, L-cysteine, and the like.

[0271] Herein, examples of amino acid residues which are replaced with each other are shown below. The amino
acid residues in the same group can be replaced with each other.

Group A:

[0272] leucine, isoleucine, norleucine, valine, norvaline, alanine, 2-aminobutanoic acid, methionine, O-methylserine,
t-butylglycine, t-butylalanine, cyclohexylalanine;

Group B:

[0273] aspartic acid, glutamic acid, isoaspartic acid, isoglutamic acid, 2-aminoadipic acid, 2-aminosuberic acid;

Group C:

[0274] asparagine, glutamine;

Group D:

[0275] lysine, arginine, ornithine, 2,4-diaminobutanoic acid, 2,3-diaminopropionic acid;

Group E:

[0276] proline, 3-hydroxyproline, 4-hydroxyproline;

Group F:

[0277] serine, threonine, homoserine;

Group G:

[0278] phenylalanine, tyrosine.
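
The grouping above can be expressed as a simple lookup. The following is a minimal sketch only: the group contents are taken from Groups A to G as listed, while the names `GROUPS`, `group_of` and `is_same_group` are hypothetical and introduced purely for illustration.

```python
# Substitution groups A to G, transcribed from the list above.
# All helper names here are illustrative assumptions, not part of the specification.
_RAW_GROUPS = {
    "A": ["leucine", "isoleucine", "norleucine", "valine", "norvaline", "alanine",
          "2-aminobutanoic acid", "methionine", "O-methylserine",
          "t-butylglycine", "t-butylalanine", "cyclohexylalanine"],
    "B": ["aspartic acid", "glutamic acid", "isoaspartic acid", "isoglutamic acid",
          "2-aminoadipic acid", "2-aminosuberic acid"],
    "C": ["asparagine", "glutamine"],
    "D": ["lysine", "arginine", "ornithine", "2,4-diaminobutanoic acid",
          "2,3-diaminopropionic acid"],
    "E": ["proline", "3-hydroxyproline", "4-hydroxyproline"],
    "F": ["serine", "threonine", "homoserine"],
    "G": ["phenylalanine", "tyrosine"],
}

# Store names lower-cased so that lookups are case-insensitive.
GROUPS = {name: frozenset(r.lower() for r in members)
          for name, members in _RAW_GROUPS.items()}


def group_of(residue):
    """Return the group letter for a residue name, or None if it is not listed."""
    key = residue.strip().lower()
    for name, members in GROUPS.items():
        if key in members:
            return name
    return None


def is_same_group(residue_a, residue_b):
    """True when both residues are listed and belong to the same group."""
    group_a = group_of(residue_a)
    return group_a is not None and group_a == group_of(residue_b)
```

Residues that appear in none of the groups (for example glycine) return `None` from `group_of`, so the sketch makes no claim about how such residues may be replaced.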

[0279] Also, in order that the resulting mutant polypeptide has substantially the same activity as that of the polypeptide
which has not been mutated, it is preferred that the mutant polypeptide has a homology of 60% or more, preferably
80% or more, and particularly preferably 95% or more, with the polypeptide which has not been mutated, when
calculated, for example, using default (initial setting) parameters by a homology searching software, such as BLAST, FASTA,
or the like.
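
The thresholds above can be illustrated with a small calculation. This sketch assumes the two sequences have already been aligned (with `-` as the gap character) and computes plain percent identity; it is not the BLAST or FASTA scoring referred to in the text, and the function names and tier labels are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: both strings have equal length and use `gap` for gaps.
    Columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def preference_tier(percent):
    """Map a percentage onto the 60% / 80% / 95% thresholds named in the text."""
    if percent >= 95:
        return "particularly preferred"
    if percent >= 80:
        return "preferred"
    if percent >= 60:
        return "acceptable"
    return "below threshold"
```

For actual comparisons, the homology value would come from the search software itself; the sketch only shows how a computed percentage maps onto the three preference levels.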
[0280] Also, the polypeptide of the present invention can be produced by a chemical synthesis method, such as
Fmoc (fluorenylmethyloxycarbonyl) method, tBoc (t-butyloxycarbonyl) method, or the like. It can also be synthesized
using a peptide synthesizer manufactured by Advanced ChemTech, Perkin-Elmer, Pharmacia, Protein Technology
Instrument, Synthecell-Vega, PerSeptive, Shimadzu Corporation, or the like.
[0281] The transformant of the present invention can be used for objects other than the production of the polypeptide
of the present invention.

[0282] Specifically, at least one component selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an
organic acid, and analogues thereof can be produced by culturing the transformant containing the polynucleotide or
recombinant vector of the present invention in a medium to produce and accumulate at least one component selected
from amino acids, nucleic acids, vitamins, saccharides, organic acids, and analogues thereof, and recovering the same
from the medium.

[0283] The biosynthesis pathways, decomposition pathways and regulatory mechanisms of physiologically active
substances such as amino acids, nucleic acids, vitamins, saccharides, organic acids and analogues thereof differ from
organism to organism. The productivity of such a physiologically active substance can be improved using these
differences, specifically by introducing a heterogeneous gene relating to the biosynthesis thereof. For example, the content
of lysine, which is one of the essential amino acids, in a plant seed was improved by introducing a synthase gene
derived from a bacterium (WO 93/19190). Also, arginine is excessively produced in a culture by introducing an arginine
synthase gene derived from Escherichia coli (Japanese Examined Patent Publication 23750/93).
[0284] To produce such a physiologically active substance, the transformant according to the present invention can
be cultured by the same method as employed in culturing the transformant for producing the polypeptide of the present
invention as described above. Also, the physiologically active substance can be recovered from the culture medium
in combination with, for example, the ion exchange resin method, the precipitation method and other known methods.
[0285] Examples of methods known to one of ordinary skill in the art include electroporation, calcium transfection,
the protoplast method, the method using a phage, and the like, when the host is a bacterium; and microinjection,
calcium phosphate transfection, the positively charged lipid-mediated method and the method using a virus, and the
like, when the host is a eukaryote (Molecular Cloning, 2nd ed.; Spector et al., Cells: a laboratory manual, Cold Spring
Harbor Laboratory Press, 1998). Examples of the host include prokaryotes, lower eukaryotes (for example, yeasts),
higher eukaryotes (for example, mammals), and cells isolated therefrom. As the state of a recombinant polynucleotide
fragment present in the host cells, it can be integrated into the chromosome of the host. Alternatively, it can be integrated
into a factor (for example, a plasmid) having an independent replication unit outside the chromosome. These
transformants are usable in producing the polypeptides of the present invention encoded by the ORF of the genome of
Corynebacterium glutamicum, the polynucleotides of the present invention and fragments thereof. Alternatively, they
can be used in producing arbitrary polypeptides under the regulation by an EMF of the present invention.

11. Preparation of antibody recognizing the polypeptide of the present invention

[0286] An antibody which recognizes the polypeptide of the present invention, such as a polyclonal antibody, a
monoclonal antibody or the like, can be produced using, as an antigen, a purified product of the polypeptide of the present
invention or a partial fragment polypeptide of the polypeptide or a peptide having a partial amino acid sequence of the
polypeptide of the present invention.

(1) Production of polyclonal antibody 

[0287] A polyclonal antibody can be produced using, as an antigen, a purified product of the polypeptide of the
present invention, a partial fragment polypeptide of the polypeptide, or a peptide having a partial amino acid sequence
of the polypeptide of the present invention, and immunizing an animal with the same.

[0288] Examples of the animal to be immunized include rabbits, goats, rats, mice, hamsters, chickens and the like. 
[0289] A dosage of the antigen is preferably 50 to 100 µg per animal.

[0290] When the peptide is used as the antigen, it is preferably a peptide covalently bonded to a carrier protein, such
as keyhole limpet haemocyanin, bovine thyroglobulin, or the like. The peptide used as the antigen can be synthesized
by a peptide synthesizer.

[0291] The administration of the antigen is, for example, carried out 3 to 10 times at intervals of 1 or 2 weeks
after the first administration. On the 3rd to 7th day after each administration, a blood sample is collected from the
venous plexus of the eyeground, and it is confirmed that the serum reacts with the antigen by the enzyme immunoassay
(Enzyme-linked Immunosorbent Assay (ELISA), Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold Spring
Harbor Laboratory (1988)) or the like.

[0292] Serum is obtained from the immunized non-human mammal with a sufficient antibody titer against the antigen
used for the immunization, and the serum is isolated and purified to obtain a polyclonal antibody.
[0293] Examples of the method for the isolation and purification include centrifugation, salting out by 40-50%
saturated ammonium sulfate, caprylic acid precipitation (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory
(1988)), or chromatography using a DEAE-Sepharose column, an anion exchange column, a protein A- or G-column,
a gel filtration column, and the like, alone or in combination thereof, by methods known to those of ordinary skill in the art.


(2) Production of monoclonal antibody 

(a) Preparation of antibody-producing cell 

[0294] A rat having a serum showing a sufficient antibody titer against a partial fragment polypeptide of the
polypeptide of the present invention used for immunization is used as a supply source of an antibody-producing cell.
[0295] On the 3rd to 7th day after the antigen substance is finally administered to the rat showing the antibody titer, the
spleen is excised.

[0296] The spleen is cut to pieces in MEM medium (manufactured by Nissui Pharmaceutical), loosened using a pair
of forceps, followed by centrifugation at 1,200 rpm for 5 minutes, and the resulting supernatant is discarded.

[0297] The spleen in the precipitated fraction is treated with a Tris-ammonium chloride buffer (pH 7.65) for 1 to 2
minutes to eliminate erythrocytes and washed three times with MEM medium, and the resulting spleen cells are used
as antibody-producing cells.

(b) Preparation of myeloma cells

[0298] As myeloma cells, an established cell line obtained from mouse or rat is used. Examples of useful cell lines
include those derived from a mouse, such as P3-X63Ag8-U1 (hereinafter referred to as "P3-U1") (Curr. Topics in
Microbiol. Immunol., 81: 1 (1978); Europ. J. Immunol., 6: 511 (1976)); SP2/0-Ag14 (SP-2) (Nature, 276: 269 (1978));
P3-X63-Ag8653 (653) (J. Immunol., 123: 1548 (1979)); P3-X63-Ag8 (X63) cell line (Nature, 256: 495 (1975)), and the
like, which are 8-azaguanine-resistant mouse (BALB/c) myeloma cell lines. These cell lines are subcultured in
8-azaguanine medium (a medium in which 8-azaguanine is further added at 15 µg/ml to a medium obtained by adding
1.5 mmol/l glutamine, 5x10^-5 mol/l 2-mercaptoethanol, 10 µg/ml gentamicin and 10% fetal calf serum (FCS)
(manufactured by CSL) to RPMI-1640 medium (hereinafter referred to as the "normal medium")) and cultured in the normal
medium 3 or 4 days before cell fusion, and 2x10^ or more of the cells are used for the fusion.

(c) Production of hybridoma 

[0299] The antibody-producing cells obtained in (a) and the myeloma cells obtained in (b) are washed with MEM
medium or PBS (disodium hydrogen phosphate: 1.83 g, sodium dihydrogen phosphate: 0.21 g, sodium chloride: 7.65
g, distilled water: 1 liter, pH: 7.2) and mixed to give a ratio of antibody-producing cells : myeloma cells = 5 : 1 to 10 : 1,
followed by centrifugation at 1,200 rpm for 5 minutes, and the supernatant is discarded.

[0300] The cells in the resulting precipitated fraction are thoroughly loosened, 0.2 to 1 ml of a mixed solution of 2
g of polyethylene glycol-1000 (PEG-1000), 2 ml of MEM medium and 0.7 ml of dimethylsulfoxide (DMSO) per 10^
antibody-producing cells is added to the cells under stirring at 37°C, and then 1 to 2 ml of MEM medium is further
added thereto several times at 1 to 2 minute intervals.

[0301] After the addition, MEM medium is added to give a total amount of 50 ml. The resulting prepared solution is
centrifuged at 900 rpm for 5 minutes, and then the supernatant is discarded. The cells in the resulting precipitated
fraction are gently loosened and then gently suspended in 100 ml of HAT medium (the normal medium to which 10^-4
mol/l hypoxanthine, 1.5x10^-5 mol/l thymidine and 4x10^-7 mol/l aminopterin have been added) by repeated drawing
up into and discharging from a measuring pipette.

[0302] The suspension is poured into a 96 well culture plate at 100 µl/well and cultured at 37°C for 7 to 14 days in
a 5% CO2 incubator.

[0303] After culturing, a part of the culture supernatant is recovered, and a hybridoma which specifically reacts with
a partial fragment polypeptide of the polypeptide of the present invention is selected according to the enzyme
immunoassay described in Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, Chapter 14 (1998) and the like.
[0304] A specific example of the enzyme immunoassay is described below.

[0305] The partial fragment polypeptide of the polypeptide of the present invention used as the antigen in the
immunization is spread on a suitable plate, is allowed to react with a hybridoma culturing supernatant or a purified antibody
obtained in (d) described below as a first antibody, and is further allowed to react with an anti-rat or anti-mouse
immunoglobulin antibody labeled with an enzyme, a chemical luminous substance, a radioactive substance or the like as a
second antibody for reaction suitable for the labeled substance. A hybridoma which specifically reacts with the
polypeptide of the present invention is selected as a hybridoma capable of producing a monoclonal antibody of the present
invention.

[0306] Cloning is repeated twice using the hybridoma by limiting dilution analysis (HT medium (a medium in which
aminopterin has been removed from HAT medium) is used first, and the normal medium is used second), and a
hybridoma which is stable and shows a sufficient antibody titer is selected as a hybridoma capable of
producing a monoclonal antibody of the present invention.

(d) Preparation of monoclonal antibody 

[0307] The monoclonal antibody-producing hybridoma cells obtained in (c) are injected intraperitoneally into 8- to
10-week-old mice or nude mice treated with pristane (intraperitoneal administration of 0.5 ml of
2,6,10,14-tetramethylpentadecane (pristane), followed by 2 weeks of feeding) at 5x10^ to 20x10^ cells/animal. The hybridoma causes
ascites tumor in 10 to 21 days.

[0308] The ascitic fluid is collected from the mice or nude mice, and centrifuged at 3000 rpm for 5 minutes to remove
solid contents.

[0309] A monoclonal antibody can be purified and isolated from the resulting supernatant by a method
similar to that used for the polyclonal antibody.

[0310] The subclass of the antibody can be determined using a mouse monoclonal antibody typing kit or a rat
monoclonal antibody typing kit. The polypeptide amount can be determined by the Lowry method or by calculation based
on the absorbance at 280 nm.
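
The calculation from the absorbance at 280 nm can be sketched with the Beer-Lambert relation. The default extinction coefficient of 1.4 (absorbance of a 1 mg/ml solution over a 1 cm path) is an assumption based on the value commonly used for immunoglobulin G and is not stated in this specification; the function name and parameters are likewise hypothetical.

```python
def concentration_mg_per_ml(a280, ext_coeff_ml_per_mg_cm=1.4,
                            path_length_cm=1.0, dilution_factor=1.0):
    """Estimate protein concentration from absorbance at 280 nm.

    Beer-Lambert: A = e * c * l, so c = A / (e * l), scaled by any dilution.
    The default coefficient is an assumed, commonly quoted IgG value and
    should be replaced with the value appropriate to the actual protein.
    """
    if ext_coeff_ml_per_mg_cm <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 * dilution_factor / (ext_coeff_ml_per_mg_cm * path_length_cm)
```

For example, under these assumptions a tenfold-diluted sample reading 0.7 corresponds to roughly 5 mg/ml in the undiluted preparation.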

[0311] The antibody obtained in the above is within the scope of the antibody of the present invention.

[0312] The antibody can be used for the general assay using an antibody, such as a radioactive material labeled
immunoassay (RIA), competitive binding assay, an immunotissue chemical staining method (ABC method, CSA
method, etc.), immunoprecipitation, Western blotting, ELISA assay, and the like (An Introduction to Radioimmunoassay and
Related Techniques, Elsevier Science (1986); Techniques in Immunocytochemistry, Academic Press, Vol. 1 (1982),
Vol. 2 (1983) & Vol. 3 (1985); Practice and Theory of Enzyme Immunoassays, Elsevier Science (1985); Enzyme-linked
Immunosorbent Assay (ELISA), Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory
(1988); Monoclonal Antibody Experiment Manual, Kodansha Scientific (1987); Second Series Biochemical Experiment
Course, Vol. 5, Immunobiochemistry Research Method, Tokyo Kagaku Dojin (1986)).
[0313] The antibody of the present invention can be used as it is or after being labeled with a label.

[0314] Examples of the label include a radioisotope, an affinity label (e.g., biotin, avidin, or the like), an enzyme label
(e.g., horseradish peroxidase, alkaline phosphatase, or the like), a fluorescence label (e.g., FITC, rhodamine, or the
like), a label using a rhodamine atom (J. Histochem. Cytochem., 18: 315 (1970); Meth. Enzym., 62: 308 (1979);
Immunol., 109: 129 (1972); J. Immunol. Meth., 13: 215 (1979)), and the like.

[0315] Expression of the polypeptide of the present invention, fluctuation of the expression, the presence or absence
of structural change of the polypeptide, and the presence or absence in an organism other than coryneform bacteria
of a polypeptide corresponding to the polypeptide can be analyzed using the antibody or the labeled antibody by the
above assay, or a polypeptide array or proteome analysis described below.

[0316] Furthermore, the polypeptide recognized by the antibody can be purified by immunoaffinity chromatography
using the antibody of the present invention.


12. Production and use of polypeptide array 

(1) Production of polypeptide array

[0317] A polypeptide array can be produced using the polypeptide of the present invention obtained in the above
item 10 or the antibody of the present invention obtained in the above item 11.

[0318] The polypeptide array of the present invention includes protein chips, and comprises a solid support and the
polypeptide or antibody of the present invention adhered to the surface of the solid support.
[0319] Examples of the solid support include plastic such as polycarbonate or the like; an acrylic resin, such as
polyacrylamide or the like; complex carbohydrates, such as agarose, sepharose, or the like; silica; a silica-based
material, carbon, a metal, inorganic glass, latex beads, and the like.

[0320] The polypeptides or antibodies according to the present invention can be adhered to the surface of the solid
support according to the method described in Biotechniques, 27: 1258-61 (1999); Molecular Medicine Today, 5: 326-7
(1999); Handbook of Experimental Immunology, 4th edition, Blackwell Scientific Publications, Chapter 10 (1986); Meth.
Enzym., 34 (1974); Advances in Experimental Medicine and Biology, 42 (1974); U.S. Patent 4,681,870; U.S. Patent
4,282,287; U.S. Patent 4,762,881, or the like.

[0321] The analysis described herein can be efficiently performed by adhering the polypeptide or antibody of the
present invention to the solid support at a high density, though a high fixation density is not always necessary.
(2) Use of polypeptide array

[0322] A polypeptide or a compound capable of binding to and interacting with the polypeptides of the present
invention adhered to the array can be identified using the polypeptide array to which the polypeptides of the present
invention have been adhered as described in the above (1).

[0323] Specifically, a polypeptide or a compound capable of binding to and interacting with the polypeptides of the
present invention can be identified by subjecting the polypeptides of the present invention to the following steps (i) to (iv):

(i) preparing a polypeptide array having the polypeptide of the present invention adhered thereto by the method
of the above (1);

(ii) incubating the polypeptide immobilized on the polypeptide array together with at least one of a second
polypeptide or compound;

(iii) detecting any complex formed between the at least one of a second polypeptide or compound and the
polypeptide immobilized on the array using, for example, a label bound to the at least one of a second polypeptide or
compound, or a secondary label which specifically binds to the complex or to a component of the complex after
unbound material has been removed; and

(iv) analyzing the detection data.
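
Step (iv) is not specified further in the text. One possible reading, given here only as a hedged sketch, is to call a spot positive when its label signal exceeds the background measured on blank spots by a chosen number of standard deviations. The spot identifiers, the blank-spot design and the three-standard-deviation cutoff are all assumptions introduced for illustration.

```python
from statistics import mean, stdev


def call_binders(spot_signals, blank_signals, n_sd=3.0):
    """Return the spot identifiers whose signal exceeds the background cutoff.

    spot_signals:  dict mapping a spot identifier to its measured label signal
    blank_signals: signals measured on spots carrying no immobilized polypeptide
    n_sd:          assumed cutoff, in standard deviations above the blank mean
    """
    if len(blank_signals) < 2:
        raise ValueError("at least two blank spots are needed to estimate background")
    cutoff = mean(blank_signals) + n_sd * stdev(blank_signals)
    return sorted(spot for spot, signal in spot_signals.items() if signal > cutoff)
```

A call such as `call_binders({"spot_1": 50, "spot_2": 12}, [10, 12, 11, 9])` would then report only the spots that stand clearly above background.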

[0324] Specific examples of the polypeptide array to which the polypeptide of the present invention has been adhered
include a polypeptide array containing a solid support to which at least one of a polypeptide containing an amino acid
sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide containing an amino acid sequence in which at
least one amino acid is deleted, replaced, inserted or added in the amino acid sequence of the polypeptide and having
substantially the same activity as that of the polypeptide, a polypeptide containing an amino acid sequence having a
homology of 60% or more with the amino acid sequences of the polypeptide and having substantially the same activity
as that of the polypeptides, a partial fragment polypeptide, and a peptide comprising an amino acid sequence of a part
of a polypeptide is adhered.

[0325] The amount of production of a polypeptide derived from coryneform bacteria can be analyzed using a
polypeptide array to which the antibody of the present invention has been adhered in the above (1).
[0326] Specifically, the expression amount of a gene derived from a mutant of coryneform bacteria can be analyzed
by subjecting the gene to the following steps (i) to (iv):

(i) preparing a polypeptide array by the method of the above (1);

(ii) incubating the polypeptide array (the first antibody) together with a polypeptide derived from a mutant of
coryneform bacteria;

(iii) detecting the polypeptide bound to the polypeptide immobilized on the array using a labeled second antibody
of the present invention; and

(iv) analyzing the detection data.

[0327] Specific examples of the polypeptide array to which the antibody of the present invention is adhered include
a polypeptide array comprising a solid support to which at least one antibody is adhered, the antibody recognizing a polypeptide
comprising an amino acid sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide comprising an amino
acid sequence in which at least one amino acid is deleted, replaced, inserted or added in the amino acid sequence
of the polypeptide and having substantially the same activity as that of the polypeptide, a polypeptide comprising an
amino acid sequence having a homology of 60% or more with the amino acid sequences of the polypeptide and having
substantially the same activity as that of the polypeptides, a partial fragment polypeptide, or a peptide comprising an
amino acid sequence of a part of a polypeptide.

[0328] A fluctuation in an expression amount of a specific polypeptide can be monitored using a polypeptide obtained
in the time course of culture as the polypeptide derived from coryneform bacteria. The culturing conditions can be
optimized by analyzing the fluctuation.
[0329] When a polypeptide derived from a mutant of coryneform bacteria is used, a mutated polypeptide can be
detected.

13. Identification of useful mutation in mutant by proteome analysis

[0330] The term proteome analysis is used herein to refer, usually, to a method wherein polypeptides are separated by
two-dimensional electrophoresis and the separated polypeptide is digested with an enzyme, followed by identification of the
polypeptide using a mass spectrometer (MS) and searching a database.

[0331] The two-dimensional electrophoresis means an electrophoretic method which is performed by combining two
electrophoretic procedures having different principles. For example, polypeptides are separated depending on
molecular weight in the primary electrophoresis. Next, the gel is rotated by 90° or 180° and the secondary electrophoresis
is carried out depending on isoelectric point. Thus, various separation patterns can be achieved (JIS K 3600 2474).
[0332] In searching the database, the amino acid sequence information of the polypeptides of the present invention
and the recording medium of the present invention provided for in the above items 2 and 8 can be used.

[0333] The proteome analysis of a coryneform bacterium and its mutant makes it possible to identify a polypeptide
showing a fluctuation therebetween.

[0334] The proteome analysis of a wild type strain of coryneform bacteria and a production strain showing an
improved productivity of a target product makes it possible to efficiently identify a mutated protein which is useful in
breeding for improving the productivity of a target product, or a protein whose expression amount fluctuates.

[0335] Specifically, a wild type strain of coryneform bacteria and a lysine-producing strain thereof are each subjected
to the proteome analysis. Then, a spot increased in the lysine-producing strain, compared with the wild type strain, is
found and a database is searched so that a polypeptide showing an increase in yield in accordance with an increase
in the lysine productivity can be identified. For example, as a result of the proteome analysis on a wild type strain and
a lysine-producing strain, the productivity of the catalase having the amino acid sequence represented by SEQ ID NO:
3785 is increased in the lysine-producing mutant.
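
The spot comparison described above can be sketched as a fold-change filter over matched spot intensities. The twofold cutoff, the spot identifiers and the function name are assumptions made for illustration; the specification does not state how an increase is quantified.

```python
def increased_spots(wild_type, producer, min_fold=2.0):
    """List spots whose intensity is higher in the producing strain.

    wild_type, producer: dicts mapping a matched spot identifier to its
    intensity on the respective two-dimensional gel.
    Returns (spot, fold_change) pairs sorted by decreasing fold change.
    """
    hits = []
    for spot, wt_intensity in wild_type.items():
        if spot not in producer:
            continue  # spot was not matched on the second gel
        if wt_intensity <= 0:
            continue  # no ratio can be formed; inspect such spots by hand
        fold = producer[spot] / wt_intensity
        if fold >= min_fold:
            hits.append((spot, fold))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

The spots returned by such a filter would then be excised, digested and identified by mass spectrometry and database searching, as the text describes.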

[0336] When a protein having a high expression level is identified by proteome analysis using the nucleotide
sequence information and the amino acid sequence information of the genome of the coryneform bacteria of the
present invention, and a recording medium storing the sequences, the nucleotide sequence of the gene encoding this
protein and the nucleotide sequence upstream thereof can be searched at the same time, and thus a nucleotide
sequence having a high expression promoter can be efficiently selected.
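
Retrieving the sequence upstream of an identified gene can be sketched as follows. The coordinate convention (0-based, half-open, forward-strand coordinates), the 300-base default and the function names are assumptions for illustration. The sketch also treats the genome as a linear string and does not wrap around the origin of a circular chromosome.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome, orf_start, orf_end, strand, length=300):
    """Return up to `length` bases upstream of an ORF, read 5' to 3'.

    genome:    forward-strand genome sequence as a string
    orf_start: 0-based start of the ORF on the forward strand (inclusive)
    orf_end:   0-based end of the ORF on the forward strand (exclusive)
    strand:    "+" or "-"
    """
    if strand == "+":
        begin = max(0, orf_start - length)
        return genome[begin:orf_start]
    if strand == "-":
        end = min(len(genome), orf_end + length)
        return reverse_complement(genome[orf_end:end])
    raise ValueError("strand must be '+' or '-'")
```

The returned fragment is the candidate promoter-containing region for the highly expressed gene; which part of it actually carries promoter activity would still need to be tested experimentally.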

[0337] In the proteome analysis, a spot on the two-dimensional electrophoresis gel showing a fluctuation is sometimes
derived from a modified protein. However, the modified protein can be efficiently identified using the nucleotide
sequence information and the amino acid sequence information of the genome of coryneform bacteria, and the
recording medium storing the sequences, according to the present invention.

[0338] Moreover, a useful mutation point in a useful mutant can be easily specified by searching a nucleotide
sequence (nucleotide sequence of promoters, ORF, or the like) relating to the thus identified protein using the
nucleotide sequence information and the amino acid sequence information of the genome of
coryneform bacteria of the present invention, and a recording medium storing the sequences, and using a primer
designed on the basis of the detected nucleotide sequence. Once the useful mutation point is specified, an
industrially useful mutant having the useful mutation or other useful mutation derived therefrom can be easily bred.
[0339] The present invention will be explained in detail below based on Examples. However, the present invention 
is not limited thereto. 

Example 1

Determination of the full nucleotide sequence of genome of Corynebacterium glutamicum 

[0340] The full nucleotide sequence of the genome of Corynebacterium glutamicum was determined based on the
whole genome shotgun method (Science, 269: 496-512 (1995)). In this method, a genome library was prepared and
the terminal sequences were determined at random. Subsequently, these sequences were ligated on a computer to
cover the full genome. Specifically, the following procedure was carried out.
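
The idea of ligating randomly determined sequences on a computer can be illustrated with a toy greedy overlap merge. This is a sketch under strong simplifying assumptions (error-free reads, a single strand, no repeats) and is not the assembly software used for the actual determination; the function names and the minimum overlap length are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are wholly contained in another read.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
```

With three overlapping toy reads such as `"ATGGCGT"`, `"GCGTACC"` and `"TACCTTA"`, the merge yields a single contig; reads that share no sufficient overlap remain as separate contigs, which corresponds to gaps that must be closed by further sequencing.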

(1) Preparation of genome DNA of Corynebacterium glutamicum ATCC 13032

[0341] Corynebacterium glutamicum ATCC 13032 was cultured in BY medium (7 g/l meat extract, 10 g/l peptone, 3
g/l sodium chloride, 5 g/l yeast extract, pH 7.2) containing 1% of glycine at 30°C overnight and the cells were collected
by centrifugation. After washing with STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l EDTA, pH
8.0), the cells were suspended in 10 ml of STE buffer containing 10 mg/ml lysozyme, followed by gentle shaking at
37°C for 1 hour. Then, 2 ml of 10% SDS was added thereto to lyse the cells, and the resultant mixture was maintained
at 65°C for 10 minutes and then cooled to room temperature. Then, 10 ml of Tris-neutralized phenol was added thereto,
followed by gentle shaking at room temperature for 30 minutes and centrifugation (15,000 x g, 20 minutes, 20°C). The
aqueous layer was separated and subjected to extraction with phenol/chloroform and extraction with chloroform (twice)
in the same manner. To the aqueous layer, 3 mol/l sodium acetate solution (pH 5.2) and isopropanol were added at
1/10 times volume and twice volume, respectively, followed by gentle stirring to precipitate the genome DNA. The
genome DNA was dissolved again in 3 ml of TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) containing
0.02 mg/ml of RNase and maintained at 37°C for 45 minutes. The extractions with phenol, phenol/chloroform and
chloroform were carried out successively in the same manner as above. The genome DNA was subjected to
isopropanol precipitation. The thus formed genome DNA precipitate was washed with 70% ethanol three times, followed
by air-drying, and dissolved in 1.25 ml of TE buffer to give a genome DNA solution (concentration: 0.1 mg/ml).

(2) Construction of a shotgun library 

[0342] TE buffer was added to 0.01 mg of the thus prepared genome DNA of Corynebacterium glutamicum ATCC
13032 to give a total volume of 0.4 ml, and the mixture was treated with a sonicator (Yamato Powersonic Model 150)
at an output of 20 continuously for 5 seconds to obtain fragments of 1 to 10 kb. The genome fragments were
blunt-ended using a DNA blunting kit (manufactured by Takara Shuzo) and then fractionated by 6% polyacrylamide gel
electrophoresis. Genome fragments of 1 to 2 kb were cut out from the gel, and 0.3 ml of MG elution buffer (0.5 mol/l
ammonium acetate, 10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) was added thereto, followed by shaking
at 37°C overnight to elute DNA. The DNA eluate was treated with phenol/chloroform, and then precipitated with ethanol
to obtain a genome library insert. The total insert and 500 ng of pUC18 SmaI/BAP (manufactured by Amersham
Pharmacia Biotech) were ligated at 16°C for 40 hours.

[0343] The ligation product was precipitated with ethanol and dissolved in 0.01 ml of TE buffer. The ligation solution
(0.001 ml) was introduced into 0.04 ml of E. coli ELECTRO MAX DH10B (manufactured by Life Technologies) by
electroporation under conditions according to the manufacturer's instructions. The mixture was spread on LB plate
medium (LB medium (10 g/l bactotrypton, 5 g/l yeast extract, 10 g/l sodium chloride, pH 7.0) containing 1.6% of agar)
containing 0.1 mg/ml ampicillin, 0.1 mg/ml X-gal and 1 mmol/l isopropyl-β-D-thiogalactopyranoside (IPTG) and cultured
at 37°C overnight.

[0344] The transformant obtained from colonies formed on the plate medium was stationarily cultured in a 96-well
titer plate having 0.05 ml of LB medium containing 0.1 mg/ml ampicillin at 37°C overnight. Then, 0.05 ml of LB medium
containing 20% glycerol was added thereto, followed by stirring to obtain a glycerol stock.

(3) Construction of cosmid library

[0345] About 0.1 mg of the genome DNA of Corynebacterium glutamicum ATCC 13032 was partially digested with
Sau3AI (manufactured by Takara Shuzo) and then ultracentrifuged (26,000 rpm, 18 hours, 20°C) under a 10 to 40%
sucrose density gradient obtained using 10% and 40% sucrose buffers (1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5
mmol/l EDTA, 10% or 40% sucrose, pH 8.0). After the centrifugation, the solution thus separated was fractionated into
tubes at 1 ml in each tube. After confirming the DNA fragment length of each fraction by agarose gel electrophoresis,
a fraction containing a large amount of DNA fragments of about 40 kb was precipitated with ethanol.
[0346] The DNA fragment was ligated to the BamHI site of superCos1 (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The ligation product was incorporated into Escherichia coli XL-1-BlueMR strain
(manufactured by Stratagene) using Gigapack III Gold Packaging Extract (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The Escherichia coli was spread on LB plate medium containing 0.1 mg/ml
ampicillin and cultured thereon at 37°C overnight to isolate colonies. The resulting colonies were stationarily cultured at
37°C overnight in a 96-well titer plate containing 0.05 ml of the LB medium containing 0.1 mg/ml ampicillin in each
well. LB medium containing 20% glycerol (0.05 ml) was added thereto, followed by stirring to obtain a glycerol stock.

(4) Determination of nucleotide sequence

(4-1) Preparation of template

[0347] The full nucleotide sequence of Corynebacterium glutamicum ATCC 13032 was determined mainly based on
the whole genome shotgun method. The template used in the whole genome shotgun method was prepared by the
PCR method using the library prepared in the above (2).

[0348] Specifically, the clone derived from the whole genome shotgun library was inoculated using a replicator
(manufactured by GENETIX) into each well of a 96-well plate containing the LB medium containing 0.1 mg/ml of
ampicillin at 0.08 ml per well and then stationarily cultured at 37°C overnight.

[0349] Next, the culture was transferred using a copy plate (manufactured by Tokken) into a 96-well
reaction plate (manufactured by PE Biosystems) containing a PCR reaction solution (TaKaRa ExTaq (manufactured by
Takara Shuzo)) at 0.08 ml per well. Then, PCR was carried out in accordance with the protocol by Makino et al.
(DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700 (manufactured by PE Biosystems) to amplify the
inserted fragment.

[0350] The excess primers and nucleotides were eliminated using a kit for purifying a PCR product (manufactured
by Amersham Pharmacia Biotech) and the residue was used as the template in the sequencing reaction.
[0351] Some nucleotide sequences were determined using a double-stranded DNA plasmid as a template.
[0352] The double-stranded DNA plasmid as the template was obtained by the following method.
[0353] The clone derived from the whole genome shotgun library was inoculated into a 24- or 96-well plate containing
a 2x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0) containing 0.05 mg/ml ampicillin
at 1.5 ml per well and then cultured under shaking at 37°C overnight.
[0354] The double-stranded DNA plasmid was prepared from the culture using an automatic plasmid
preparing machine, KURABO PI-50 (manufactured by Kurabo Industries), or a multiscreen (manufactured by Millipore) in
accordance with the protocol provided by the manufacturer.

[0355] To purify the double-stranded DNA plasmid using the multiscreen, Biomek 2000 (manufactured by Beckman
Coulter) or the like was employed.
[0356] The thus obtained double-stranded DNA plasmid was dissolved in water to give a concentration of about 0.1
mg/ml and used as the template in sequencing.

(4-2) Sequencing reaction 

[0357] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured
by PE Biosystems), an M13 regular direction primer (M13-21) or an M13 reverse direction primer (M13REV) (DNA
Research, 5: 1-9 (1998)) and the template prepared in the above (4-1) (the PCR product or the plasmid) were added
to give 10 µl of a sequencing reaction solution. The primers and the templates were used in an amount of 1.6 pmol
and an amount of 50 to 200 ng, respectively.
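
As a worked illustration of the quantities above, the following minimal sketch converts the stated template mass (50 to 200 ng) into a pipetting volume, assuming the plasmid template is at the concentration of about 0.1 mg/ml given in (4-1). The function name and default value are illustrative only and are not part of the described procedure.

```python
def template_volume_ul(mass_ng: float, conc_mg_per_ml: float = 0.1) -> float:
    """Volume in microliters of template solution that contains mass_ng of DNA.

    Assumption: the template is a plasmid solution at about 0.1 mg/ml.
    """
    conc_ng_per_ul = conc_mg_per_ml * 1000.0  # 1 mg/ml equals 1000 ng/µl
    return mass_ng / conc_ng_per_ul


low = template_volume_ul(50)    # lower bound of the stated template amount
high = template_volume_ul(200)  # upper bound of the stated template amount
print(f"template volume: {low:.1f} to {high:.1f} µl")
```

Under this assumption the template occupies at most about 2 µl, which fits within the 4 µl that remain after the 6 µl of kit solution in a 10 µl reaction.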

[0358] A dye terminator sequencing reaction of 45 cycles was carried out with GeneAmp PCR System 9700
(manufactured by PE Biosystems) using the reaction solution. The cycle parameters were determined in accordance with the
manufacturer's instructions accompanying ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit. The
sample was purified using a Multiscreen HV plate (manufactured by Millipore) according to the manufacturer's
instructions. The thus purified reaction product was precipitated with ethanol, followed by drying, and then stored in the dark
at -30°C.

[0359] The dry reaction product was analyzed by ABI PRISM 377 DNA Sequencer and ABI PRISM 3700 DNA
Analyzer (both manufactured by PE Biosystems), each in accordance with the manufacturer's instructions.
[0360] The data of about 50,000 sequences in total (i.e., about 42,000 sequences obtained using the 377 DNA
Sequencer and about 8,000 reactions obtained by the 3700 DNA Analyzer) were transferred to a server (Alpha Server 4100:
manufactured by COMPAQ) and stored. The data of these about 50,000 sequences corresponded to 6 times as much as
the genome size.
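
The relationship between read count and the stated 6-fold redundancy can be checked with the usual coverage formula (coverage = number of reads x mean read length / genome size). The sketch below is illustrative only: the genome size of 3.3 Mb is an assumed round figure, not a value stated in this passage, and the function names are hypothetical.

```python
def coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: int) -> float:
    """Average number of times each genome position is sequenced."""
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_read_length(n_reads: int, genome_size_bp: int, fold: float) -> float:
    """Mean read length needed for n_reads to give the stated fold coverage."""
    return fold * genome_size_bp / n_reads


GENOME_SIZE_BP = 3_300_000  # assumption: approximate genome size, for illustration only

read_len = implied_read_length(50_000, GENOME_SIZE_BP, 6)
print(f"implied mean read length: {read_len:.0f} bp")
print(f"coverage check: {coverage(50_000, read_len, GENOME_SIZE_BP):.1f}x")
```

With these assumed numbers, about 50,000 reads of roughly 400 bp each would account for the 6-fold figure.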

(5) Assembly 

[0361] All operations were carried out on a UNIX platform. The analytical data were output on a Macintosh
platform using the X Window System. The base call was carried out using phred (The University of Washington). The
vector sequence data was deleted using SPS Cross_Match (manufactured by Southwest Parallel Software). The
assembly was carried out using SPS phrap (manufactured by Southwest Parallel Software; a high-speed version of phrap
(The University of Washington)). The contig obtained by the assembly was analyzed using a graphical editor, consed
(The University of Washington). The series of operations from the base call to the assembly was carried out in a
single run using the script phredPhrap attached to consed.
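
To make the idea of computational assembly concrete, the following toy sketch merges reads by their longest exact suffix-prefix overlap. It is a deliberately simplified greedy illustration and is not the algorithm used by phrap, which works with base qualities and inexact alignments; all names and the example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the largest overlap until none remains."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the reads left over are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"])
print(contigs)
```

Reads that share no sufficient overlap remain as separate contigs, which corresponds to the gaps addressed in the next step.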

(6) Determination of nucleotide sequence in gap part 

[0362] Each cosmid in the cosmid library constructed in the above (3) was prepared by a method similar to the
preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end of
the inserted fragment of the cosmid was determined by using ABI PRISM BigDye Terminator Cycle Sequencing Ready
Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's instructions.

[0363] About 800 cosmid clones were sequenced at both ends, and the contigs derived from the shotgun sequencing
in the above (5) were searched for nucleotide sequences coincident with these end sequences. Thus, the linkage between
the respective cosmid clones and the respective contigs was determined and mutual alignment was carried out. Furthermore,
the results were compared with the physical map of Corynebacterium glutamicum ATCC 13032 (Mol. Gen. Genet.,
252: 255-265 (1996)) to carry out mapping between the cosmids and the contigs.

[0364] The sequence in the region which was not covered with the contigs was determined by the following method.
[0365] Clones containing sequences positioned at the ends of contigs were selected. Among these clones, about
1,000 clones wherein only one end of the inserted fragment had been determined were selected and the sequence at
the opposite end of the inserted fragment was determined. A shotgun library clone or a cosmid clone containing the
sequences at the respective ends of the inserted fragment in two contigs was identified, the full nucleotide sequence
of the inserted fragment of this clone was determined, and thus the nucleotide sequence of the gap part was determined.
When no shotgun library clone or cosmid clone covering the gap part was available, primers complementary to the
end sequences at the two contigs were prepared and the DNA fragment in the gap part was amplified by PCR. Then,
sequencing was performed by the primer walking method using the amplified DNA fragment as a template or by the
shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment was determined.
Thus, the nucleotide sequence of the domain was determined.
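
The selection of gap-spanning clones described above can be sketched as a simple lookup: a clone bridges a gap when its two end sequences match two different contigs. The data structure, clone names, and contig names below are hypothetical and serve only to illustrate the logic, not the software actually used.

```python
from collections import defaultdict


def bridging_clones(end_hits):
    """Group clones whose two insert ends match two different contigs.

    end_hits maps a clone id to a pair (contig matched by one end,
    contig matched by the other end); None means the end is unmatched
    or not yet sequenced.
    """
    bridges = defaultdict(list)
    for clone, (left, right) in end_hits.items():
        if left is not None and right is not None and left != right:
            # Sort the pair so that (A, B) and (B, A) describe the same gap.
            bridges[tuple(sorted((left, right)))].append(clone)
    return dict(bridges)


hits = {
    "cos001": ("contig1", "contig2"),  # spans the gap between contig1 and contig2
    "sg0412": ("contig2", "contig2"),  # both ends inside one contig
    "cos017": ("contig3", None),       # opposite end not yet determined
    "sg0977": ("contig2", "contig1"),  # same gap, opposite orientation
}
print(bridging_clones(hits))
```

Contig pairs with no bridging clone in such a lookup would be the ones left for PCR amplification across the gap.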

[0366] In a region showing a low sequence precision, primers were synthesized using the AUTOFINISH function and
the NAVIGATING function of consed (The University of Washington) and the sequence was determined by the primer
walking method to improve the sequence precision. The thus determined full nucleotide sequence of the genome of
Corynebacterium glutamicum ATCC 13032 strain is shown in SEQ ID NO:1.

(7) Identification of ORF and presumption of its function 

[0367] ORFs in the nucleotide sequence represented by SEQ ID NO:1 were identified according to the following
method. First, the ORF regions were determined using software for identifying ORFs, i.e., Glimmer, GeneMark and
GeneMark.hmm, on a UNIX platform according to the respective manuals attached to the software.
[0368] Based on the data thus obtained, ORFs in the nucleotide sequence represented by SEQ ID NO:1 were
identified.
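
For readers unfamiliar with ORF identification, the following minimal sketch scans the three forward reading frames for a start codon followed in frame by a stop codon. It is a naive illustration only: Glimmer, GeneMark and GeneMark.hmm use statistical gene models rather than this simple rule, and the function names, the minimum length, and the example sequence are assumptions made for the illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, starts=("ATG",), min_codons: int = 3):
    """Return (start, end) 0-based half-open coordinates of ORFs on the given strand.

    An ORF runs from the first start codon in a frame to the next
    in-frame stop codon (stop included) and needs at least min_codons codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in starts:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


def reverse_complement(seq: str) -> str:
    """Reverse complement, so that the opposite strand can be scanned as well."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


example = "CCATGAAATTTGGGTAACC"
print(find_orfs(example))                      # ORFs on the forward strand
print(find_orfs(reverse_complement(example)))  # ORFs on the reverse strand
```

Passing additional codons through the `starts` parameter allows alternative initiation codons to be considered.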

[0369] The putative function of an ORF was determined by searching the homology of the identified amino acid
sequence of the ORF against an amino acid database consisting of protein-encoding domains derived from
Swiss-Prot, PIR, or the Genpept database constituted by protein-encoding domains derived from the GenBank database,
using either Frame Search (manufactured by Compugen) or BLAST. The nucleotide sequences
of the thus determined ORFs are shown in SEQ ID NOS:2 to 3501, and the amino acid sequences encoded by these
ORFs are shown in SEQ ID NOS:3502 to 7001.

[0370] In some cases of the sequence listings in the present invention, nucleotide sequences, such as TTG, TGT, 
GGT, and the like, other than ATG, are read as an initiating codon encoding Met. 
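
The convention that a non-ATG initiating codon is read as Met can be illustrated with a short translation routine. This is a sketch using the standard genetic code; it forces the first codon of a given ORF to Met regardless of its sequence, and the function name and example ORF are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(orf: str) -> str:
    """Translate an ORF, reading its first codon as Met whatever its sequence.

    Translation stops at the first in-frame stop codon. A codon containing
    a character other than A, C, G or T raises KeyError.
    """
    orf = orf.upper()
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    if not codons:
        return ""
    protein = ["M"]  # the initiating codon encodes Met
    for codon in codons[1:]:
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate_orf("TTGAAATTTGGGTAA"))  # TTG would be Leu at an internal position
```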

[0371] Also, the preferred nucleotide sequences are SEQ ID NOS:2 to 355 and 357 to 3501, and the preferred amino
acid sequences are shown in SEQ ID NOS:3502 to 3855 and 3857 to 7001.
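
The SEQ ID ranges given above are consistent with a fixed offset of 3500 between an ORF's nucleotide sequence and its encoded amino acid sequence (2 to 3502, 3501 to 7001, and the excluded 356 to 3856). The sketch below encodes that reading; the one-to-one ordering of the two listings is an inference from the stated ranges rather than an explicit statement in the text, and the names are illustrative.

```python
SEQ_ID_OFFSET = 3500  # inferred from the ranges 2-3501 (DNA) and 3502-7001 (amino acid)


def protein_seq_id(dna_seq_id: int) -> int:
    """SEQ ID NO of the amino acid sequence paired with an ORF's DNA SEQ ID NO.

    Assumption: both listings are in the same order, so the pairing
    is a constant offset.
    """
    if not 2 <= dna_seq_id <= 3501:
        raise ValueError("ORF nucleotide sequences are SEQ ID NOS:2 to 3501")
    return dna_seq_id + SEQ_ID_OFFSET


for dna_id in (2, 355, 357, 3501):
    print(dna_id, "->", protein_seq_id(dna_id))
```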

[0372] Table 1 shows the registration numbers in the above-described databases of sequences which were judged
as having the highest homology with the nucleotide sequences of the ORFs as the results of the homology search in
the amino acid sequences using the homology-searching software Frame Search (manufactured by Compugen),
names of the genes of these sequences, the functions of the genes, and the matched length, identities and analogies
compared with publicly known amino acid translation sequences. Moreover, the corresponding positions were
confirmed via the alignment of the nucleotide sequence of an arbitrary ORF with the nucleotide sequence of SEQ ID NO:
1. Also, the positions of nucleotide sequences other than the ORFs (for example, ribosomal RNA genes, transfer RNA
genes, IS sequences, and the like) on the genome were determined.
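
As an illustration of how a matched length and a percent identity of the kind listed in Table 1 can be derived from a pairwise alignment, the sketch below counts aligned (non-gap) positions and identical residues. The exact definitions used by Frame Search may differ, so this is an assumption-laden example rather than a description of that software; the aligned sequences are invented.

```python
def matched_length_and_identity(aln_a: str, aln_b: str):
    """Return (matched length, percent identity) for two gapped, aligned sequences.

    Matched length is taken here as the number of columns in which
    neither sequence has a gap ('-'); identity is the share of those
    columns with identical residues.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    matched = len(pairs)
    identical = sum(x == y for x, y in pairs)
    identity = 100.0 * identical / matched if matched else 0.0
    return matched, identity


length, identity = matched_length_and_identity("MKT-AYIAK", "MKSLAY-AR")
print(f"matched length: {length} a.a., identity: {identity:.1f}%")
```

A similarity figure would additionally require a residue substitution matrix to decide which non-identical pairs count as similar.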

[0373] Fig. 1 shows the positions of typical genes of Corynebacterium glutamicum ATCC 13032 on the genome.
ferric enterobactin transport system
hypothetical protein
magnesium and cobalt transport
hypothetical protein


biotin synthase 
urease accessory protein


urease accessory protein 


urease accessory protein 
epoxide hydrolase
probable electron transfer protein
aspartate transaminase




DNA polymerase III holoenzyme tau 
hypothetical protein


recombination protein 
Function


class A penicillin-binding
protein (PBP1)


regulatory protein 




hypothetical protein 


transcriptional regulator 


shikimate transport protein 




long-chain-fatty-acid— CoA ligase 


transcriptional regulator 


3-oxoacyl-(acyl-carrier-protein)
reductase


glutamine synthetase


Short-chain acyl CoA oxidase 


nodulation protein


hydrolase 






cAMP receptor protein 




ultraviolet N-glycosylase/AP lyase 


cytochrome c biogenesis protein 
Homologous gene


Mycobacterium leprae pon1 


Streptomyces coelicolor A3(2) 
whiB 




Streptomyces coelicolor A3(2) 
SCH17.10C 


Mycobacterium tuberculosis 
H37RV Rv3678c 


Escherichia coli K12 shiA




Bacillus subtilis IcfA 


Streptomyces coelicolor A3(2)
SCJ4.28C 


Bacillus subtilis fabG 


Emericella nidulans fluG 


Arabidopsis thaliana atgS


Rhizobium leguminosarum nod


Mycobacterium tuberculosis 
H37RV Rv3677c 






Vibrio cholerae crp




Micrococcus luteus pdg 


Mycobacterium tuberculosis 
Function


hypothetical protein 


serine proteinase 


epoxide hydrolase 


hypothetical membrane protein 


phosphoserine phosphatase 


hypothetical protein 


conjugal transfer region protein 




hypothetical membrane protein 


hypothetical protein 


hypothetical protein 








ATP-dependent RNA helicase 


cold shock protein 




DNA topoisomerase I
65.5
56.5
84.8








66.1 
46.8


29.6 


35.0 


32.9 
33.8
61.7




Homologous gene 


Escherichia coli K12 yeaB 


Mycobacterium tuberculosis 
H37Rv Rv3671c


Corynebacterium sp. C12 cEH 


Mycobacterium tuberculosis 
H37RV Rv3669 


Mycobacterium leprae 
MTCY20G9.32C. serB 


Mycobacterium tuberculosis 
H37RV Rv3660c 


Escherichia coli trbB




Mycobacterium tuberculosis 
H37RV Rv3658c 


Mycobacterium tuberculosis 
H37RV Rv3657c 


Mycobacterium tuberculosis 
H37RV Rv3656c 








Bacillus subtilis yprA 


Arthrobacter globiformis SI55 
H37Rv Rv3646c topA
Function


adenylate cyclase 


DNA polymerase III subunit
tau/gamma 




hypothetical protein 


hypothetical protein 


ribosomal large subunit 
pseudouridine synthase C 


beta-glucosidase/xylosidase 


beta-glucosidase 


NAD/mycothiol-dependent 
formaldehyde dehydrogenase 




metallo-beta-lactamase superfamily 


valanimycin resistant protein 


dTDP-glucose 4,6-dehydratase


hypothetical protein 


dolichol phosphate mannose 
synthase




nucleotide sugar synthetase 


UDP-sugar hydrolase 
Table 1 (continued)


Homologous gene 


Stigmatella aurantiaca B17R20 
cyaB 


Bacillus subtilis dnaX 




Ureaplasma urealyticum uu033 


Deinococcus radiodurans 
DR0202 


Escherichia coli K12 rluC 


Erwinia chrysanthemi D1 bgxA


Azospirillum irakense salB 


Amycolatopsis methanolica




Rhodococcus erythropolis ortS 


Escherichia coli K12 fabG


Streptomyces viridifaciens vimf 


Actinoplanes sp. acbB 


Mycobacterium tuberculosis 
H37RV Rv3632 


Methanococcus jannaschii JAL-1
MJ1222




Escherichia coli K12 yefJ 


Salmonella typhimurium ushA
Function




NADP-dependent alcohol 
dehydrogenase 


glucose-1-phosphate
thymidylyltransferase 


dTDP-4-keto-L-rhamnose reductase 


dTDP-glucose 4,6-dehydratase


NADH dehydrogenase 


Fe-regulated protein 




hypothetical membrane protein 


metallopeptidase 


prolyl endopeptidase




hypothetical membrane protein 


cell surface layer protein


autophosphorylating protein Tyr 
kinase 


protein phosphatase




capsular polysaccharide 
biosynthesis 


ORF 3 


lipopolysaccharide biosynthesis / 
aminotransferase 
Table 1 (continued)


Homologous gene 




Mycobacterium tuberculosis 
H37RV adhC 


Salmonella anatum M32 rfbA


Streptococcus mutans rmlC


Streptococcus mutans XC rmlB 


Thermus aquaticus HB8 nox 


Staphylococcus aureus sirA 




Mycobacterium tuberculosis 
H37RV RV3630 


Streptomyces coelicolor 
SC5F2A.19C 


Sphingomonas capsulata 




Streptomyces coelicolor A3(2) 


Corynebacterium
ammoniagenes ATCC 6872


Acinetobacter johnsonii ptk


Acinetobacter johnsonii ptp




Staphylococcus aureus M capD 


Vibrio cholerae 


Campylobacter jejuni wlaK 
Function


pilin glycosylation protein 


capsular polysaccharide 
biosynthesis 


lipopolysaccharide biosynthesis / 
export protein 


UDP-N-acetylglucosamine 1- 
carboxyvinyltransferase 


UDP-N- 

acetylenolpyruvoylglucosamine 
reductase 


sugar transferase 


transposase 




transposase (insertion sequence
IS31831)




hypothetical protein 


acetyltransferase 


hypothetical protein 8 


UDP-glucose 6-dehydrogenase 






glycosyl transferase 


acetyltransferase 




Matched 
length 
(a.a) 
Similarity
(%) 


75.0 


69.2 


69.8 


64.6 


68.5 


57.3 


79.3 




94.3 




57.4 

1 


60.2 


53.0 


89.7 






65.0 


62.0 




Identity
(%)


54.6


33.4 


34.3 


31.4 


34.8 


32.0 
Function


glycosyl transferase 


malonyl-CoA-decarboxylase 


hypothetical membrane protein 


ketoglutarate semialdehyde 
dehydrogenase 


5-dehydro-4-deoxyglucarate 
dehydratase 


als operon regulatory protein


hypothetical protein
low-affinity inorganic phosphate
transporter 






naphthoate synthase


peptidase E
Rhizobium trifolii matB
Function


succinate-semialdehyde 
dehydrogenase (NAD(P)+) 


novel two-component regulatory 
system 


tyrosine-specific transport protein 


cation-transporting ATPase G 


hypothetical protein or 
dehydrogenase 




50S ribosomal protein L10


50S ribosomal protein L7/L12




hypothetical membrane protein 


DNA-directed RNA polymerase beta 
chain 


DNA-directed RNA polymerase beta 
chain 


hypothetical protein 




DNA-binding protein 


hypothetical protein 
Table 1 (continued)


Homologous gene 


Escherichia coli K12 gabD 


Azospirillum brasilense carR


Escherichia coli K12 o341#7 
tyrP 


Mycobacterium tuberculosis 
H37Rv Rv1992c ctpG


Streptomyces lividans P49 




Streptomyces griseus N2-3-1 
rplJ


Mycobacterium tuberculosis
H37Rv Rv0652 rplL




Mycobacterium tuberculosis
H37Rv Rv0227c


Mycobacterium tuberculosis
H37Rv Rv0667 rpoB


Mycobacterium tuberculosis 
Function


30S ribosomal protein S12


30S ribosomal protein S7


elongation factor G 






lipoprotein






ferric enterobactin transport ATP- 
binding protein 


ferric enterobactin transport protein 


ferric enterobactin transport protein 


butyryl-CoA:acetate coenzyme A
transferase 


30S ribosomal protein S10


50S ribosomal protein L3


50S ribosomal protein L4


50S ribosomal protein L23


50S ribosomal protein L2


30S ribosomal protein S19
50S ribosomal protein L22


30S ribosomal protein S3


50S ribosomal protein L16


50S ribosomal protein L29


30S ribosomal protein S17


50S ribosomal protein L14


50S ribosomal protein L24


50S ribosomal protein L5




2,5-diketo-D-gluconic acid reductase




formate dehydrogenase chain D


molybdopterin-guanine dinucleotide 
biosynthesis protein 


formate dehydrogenase H or alpha 
chain 






ABC transporter ATP-binding protein 
Function


transcriptional repressor 


adenylate kinase 




1 methionine aminopeptidase 1 




translation initiation factor IF-1 


30S ribosomal protein S13


30S ribosomal protein S11


30S ribosomal protein S4


RNA polymerase alpha subunit 




50S ribosomal protein L17


pseudouridylate synthase A 


hypothetical membrane protein 






hypothetical protein 


cell elongation protein 


cyclopropane-fatty-acyl-phospholipid
synthase 


hypothetical membrane protein 
SC5H1.34


Corynebacterium diphtheriae 
irp1 


Mycobacterium tuberculosis 
H37RV Rv3366 spoU 


Mycobacterium tuberculosis 
H37RV Rv3356c folD 


Mycobacterium leprae 
MLCB1779.16C 


Streptomyces coelicolor A3(2) 
SC66T3.18C 




Corynebacterium glutamicum 
metA 


Leptospira meyeri metY


Escherichia coli K12 cstA




Escherichia coli K12 yjiX 
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1 

UJ 

u 

a. 
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00 
00 

o 

O 
gp:SCG8A_5
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T- 
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o 
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CD 
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o 

Ol 
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o 
Terminal
(nt) 


656534 


655097 


657215 


657205 


658142 


658928 


659424 


660538 


660650 


662017 


562374 


562382 


564126 


665183 


666460 


670465 


669445 


8 

CD 


571045 


Initial
(nt) 


655122 


655834


656547 


658002 


658005 


658155 


653933 


659543 
661120 


661166 


652120 


663761


890999 


666313 


667770


668264


670053 


670472 


671653 


SEQ NO.
(a.a.)
4214


4215 


4216 


4217 


4218 


4219 


o 
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OJ 
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4222 


4223 
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CN 
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! 
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4227 
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CN 
OJ 
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NO. 
(DNA) 




ro 




in 


<o 


717 
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Function 


hypothetical protein 


carboxy phosphoenolpyruvate
mutase 


Citrate synthase 




hypothetical protein 




L-malate dehydrogenase 


regulatory protein 




vibriobactin utilization protein 


ABC transporter ATP-binding protein 


ABC transporter 

1 


ABC transporter 


iron-regulated lipoprotein precursor 


chloramphenicol resistance protein 1 


catabolite repression control protein 


hypothetical protein 






Matched 
length 
(a.a) 


ro 


oo 
<N 


o 

00 

ro 




ro 
to 




00 

ro 

CO 


(O 
CN 
CN 


1 


GO 
fN 


C7) 
(O 
CN 


ai 

CO 
CO 


o 

CO 
CO 


CO 

in 

CO 


in 

05 
CO 


ro 
O 
ro 


CN 






Similarity 
(%) 


86.4 


76.2 


81.3 




62.3 




67.5 


62.8 




54.2 


85.1 


86.4 


88.2 


82.3 


to 

(O 


58.1


85.8 






Identity 
(%) 


71.0 




56.1 




9 

ro 




37.6 


<D 
CN 




25.4 


in 
m 


56.3 


63.0 


53.1 


32.2


30.4 


56.2 


t 




Homologous gene 


Mycobacterium tuberculosis 
H37RV Rv1130 


Streptomyces hygroscopicus


Mycobacterium smegmatis
ATCC 607 gltA




Escherichia coli K12 yneC 




Methanothermus fervidus V2AS 
mdh 


Bacillus stearothermophilus T-6 




Vibrio cholerae OGAWA 395 
viuB 


Corynebacterium diphtheriae 
irp1D 


Corynebacterium diphtheriae 
irp1C


Corynebacterium diphtheriae
irp1B


Corynebacterium diphtheriae
irp1


Streptomyces venezuelae cmlv


Pseudomonas aeruginosa crc


Haemophilus influenzae Rd
HI1240






db Match 


pir:C70539 


prf:1902224A
sp:YNEC_ECOLI
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o 
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675082 


676218 


h- 
"V 
o 
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(D 


680131 


681040 


681846 


682871 


683876 


686380 


687346 


688007 


688335




Initial 
(nt) 


671700 


672665 


673608 


673639 


674990 


675175 


676122 


676937 


677748 


681027 


681846 


682904 


683866 


684925 


685109 


686435 


687351 


688141 




SEQ 
NO. 
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CN 
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CO 
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Function 




ferrichrome ABC transporter 


hemin permease


tryptophanyl-tRNA synthetase


hypothetical protein




penicillin-binding protein 6B 
precursor 



hypothetical protein 


hypothetical protein 






uracil phosphoribosyltransferase


bacterial regulatory protein, lacI
family


N-acyl-L-amino acid amidohydrolase 
or peptidase 


phosphomannomutase 


dihydrolipoamide dehydrogenase 


pyruvate carboxylase 


hypothetical protein 


hypothetical protein 




Matched 
length 
(aa) 




CNJ 


CO 


CO 
CO 


00 




o 

CO 
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CN 
CO 
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o 
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CO 
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00 

CO 
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CO 
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CO 
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in 
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CD 
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CO 

cn 
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CO 




in 


CN 
CN 


CO 


CD 
CN 


d 

CO 




Homologous gene 




Corynebacterium diphtheriae 
hmuV 


Yersinia enterocolitica hemU


Escherichia coli K12 trpS 


Escherichia coli K12 yhjO 




Salmonella typhimurium LT2 
dacD 


Mycobacterium tuberculosis 
H37Rv Rv3311


Streptomyces coelicolor A3(2)
SC6G10.08c






Lactococcus lactis upp


Streptomyces coelicolor A3(2) 
SC1A2.11 


Mycobacterium tuberculosis 
H37RV Rv3305c amiA 


Mycoplasma pirum BER manB 


Halobacterium volcanii ATCC
29605 lpd


Corynebacterium glutamicum
strain 21253 pyc


Mycobacterium tuberculosis
H37Rv Rv1324


Streptomyces coelicolor A3(2) 
SCF11.30 


db Match 
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CN 

<o 

O) 

o 


S54438 
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LU 

1 

Q 
-o 
I 
> 




> 

< 

CO 
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a 
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Q. 
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CO 
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o 

O) 
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CO 

in 

CO 
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a> 


in 

fO 


fO 
CO 
<D 


00 
CO 


1182 


1725 


1407 


3420 


o 
CO 


CO 
CO 
TT 




Terminal 
(nt) 


688916 


689917 


690706 


692916 


694110 


695074 


695077 


696769 


698065 


699266 


698922 


699913 


700381


703262 


700384 


704811 


708630 


709708 


710278 




Initial 
(nt) 


689890 


690696 


691722 


691882 


693028 


694172


696213 


697995 


698922 


699072 


699272


699281 


699998 


702081


702108 


703405 


705211 


708839 


709793 
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"V 
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1 ^ 
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CO 
CN 
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to 

CN 


4265 
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CN 
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NO 
(DNA)
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Function 


hypothetical protein


thioredoxin reductase


PrpD protein for propionate
catabolism


carboxy phosphoenolpyruvate
mutase


hypothetical protein


citrate synthase


hypothetical protein


thiosulfate sulfurtransferase


hypothetical protein


hypothetical protein


hypothetical membrane protein


hypothetical protein


hypothetical protein


detergent sensitivity rescuer or
carboxyl transferase


detergent sensitivity rescuer or
carboxyl transferase


Matched 
length 
(a.a)


00 


lO 
CO 


CN 

m 


00 
CN 


to 
a) 


ro 

CO 
CO 




CD 

in 






m 

CM 
Cvl 


CN 

in 
CO 


CO 
CO 


00 


CM 

cn 


CO 
CO 


CO 
LO 


ro 
in 


>> 






































Similarity
(%)


69.0 


59.3 


49.5 


74.5 


47.0 


78.9 




72.6 






100.0 


79.8 


76.7 


63.4 


66.2 


00 

<x> 

CO 


100.0 


100.0 


Identity 
(%)


44.6 


24.6 


24.0 


42.5 


39.0 


to 

CO 




CO 

d 






100.0


61.1 


51.1 


35.1 


31.8 


33.3


99.8 


99.6 


Homologous gene 


Bacillus subtilis 168 yciC 


Bacillus subtilis IS58 trxB


Salmonella typhimurium LT2 
prpD 


Streptomyces hygroscopicus 


Aeropyrum pernix K1 APE0223 


Mycobacterium smegmatis 
ATCC 607 gltA




Mycobacterium tuberculosis 
H37RV Rv1129c 






Corynebacterium glutamicum 
ATCC 13032 thtR 


Campylobacter jejuni Cj0069 


Mycobacterium leprae 
MLCB4.27c


Mycobacterium tuberculosis 
H37Rv Rv1565c


Escherichia coli K12 yceF


Mycobacterium leprae B1308-
C3-211


Corynebacterium glutamicum 
AJ11060 dtsR2 


Corynebacterium glutamicum 
AJ11060 dtsR1
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o> 

CD 
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cn 
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CM 
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to 

CN 
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Terminal 
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712647 i 
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715145 


714380
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720016 


720547 


722841 


722925 
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726470 


726742 


728696 
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o 

(D 


711724 


712738 


714258 


714757 
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716660 


718009 


718105 


718558 


721449 


721777 


723338 


723412 
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726715 


728352 


CM 

CO 

o 

ro 
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00 

to 
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o 
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CN 
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CO 
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o 
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CN 
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CO 
CN 


CM 

oo 
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CO 
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CO 
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Function 


bifunctional protein (biotin synthesis
repressor and biotin acetyl-CoA
carboxylase ligase)


hypothetical membrane protein 


5'-phosphoribosyl-5-amino-4-
imidazole carboxylase


c 

Qi 

o 

Q- 

TO 
D. 

-¥ 
id 






5'-phosphoribosyl-5-amino-4-
imidazole carboxylase


hypothetical protein 


hypothetical protein 


nitrilotriacetate monooxygenase 


transposase (ISA0963-5) 


glucose 1-dehydrogenase 


hypothetical membrane protein 




hypothetical protein 


hypothetical protein 








Matched 
length 
(aa) 


cn 

O) 
CN 


m 

CO 


cn 


OJ 
CN 
CO 








CN 


in 
in 

CM 


(0 
CN 


cn 
o 
m 


CO 

m 

CN 


(0 
O) 




in 


CN 

%r 
1— 








Similarity 
(%) 


61.8 


58.8 


83.8 


73.6 






93.2 


60.5 


70.6 


73.0 


52.5 


64.8 


68.8 




66.3 


00 

co' 








Identity 
(%) 


28.7 


23.0 


69.0 


41.1 






85.7 


36.2 


CD 
CN 


CM 
CO 


23.4 


31.3 


29.2 




28.6 


35.9 






Table 1 (continued) 


Homologous gene 


Escherichia coli K12 birA 


Mycobacterium tuberculosis 
H37RV Rv3278c 


Corynebacterium
ammoniagenes ATCC 6872 
purK 


Escherichia coli K12 kup 






Corynebacterium
ammoniagenes ATCC 6872 
purE 


Actinosynnema pretiosum 


Streptomyces coelicolor A3(2) 
SCF43A.36 


Chelatobacter heintzii ATCC
29600 ntaA 


Archaeoglobus fulgidus 


Bacillus megaterium IAM 1030
gdhII


Thermotoga maritima MSB8 
TM1408 




Bacillus subtilis 168 ywjB


Streptomyces coelicolor A3(2) 
SCJ9A.21 
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CM 
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o 
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CM 
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733017 


734943 


733183 


735340 
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737216 


738673 
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741765 


742195 


741818 


742828 
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730436 




731312 
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733072 


733797 


734984 


735402 


735899 
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738529 
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741016 
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00 
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o 

CN 
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00 
CN 


4287 


4288 


4289 


4290 


4291 


CN 

CN 
^ 


cn 

CM 

^ 


4294 


4295 


CD 

o> 

CN 


4297 


00 
CJ) 
CN 
^ 


CJ> 
CJ» 
(N 


o 
o 

CO 


o 
cn 


4302 
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(DNA) 
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Function 


trehalose/maltose-binding protein 


trehalose/maltose-binding protein 




trehalose/maltose-binding protein 




ABC transporter ATP-binding protein 
(ABC-type sugar transport protein) 
or cellobiose/maltose transport
protein 




RNA helicase 






hypothetical protein 


hypothetical protein 


DNA helicase II










RNA helicase 


hypothetical protein 


RNA polymerase associated protein
(ATP-dependent helicase) 


Matched 
length 
(aa) 




(O 

o 

CO 




p- 




CM 
CO 

<o 




1783 






o 

CN 


o 

CN 


o 










2033 


CO 
O) 
CO 


ro 
p- 
CO 


Similarity 
(%) 


75.3 


70.3




62.4 




73.9 




cn 
ai 






59.2 


62.5 












oo 
in 


53.2 


to 

00 


Identity 


42.4 


37.3 




d 

CO 




57.2 




25.1 






31.7 


30.0 


20.7 










22.4 


24.4 


CO 
C\ 


Homologous gene 


Thermococcus litoralis malG 


Thermococcus litoralis malF




Thermococcus litoralis malE




Streptomyces reticuli msiK




Deinococcus radiodurans R1
DRB0135 






Mycobacterium tuberculosis 
H37RV Rv3268 


Helicobacter pylori J99 jhp0462 


Escherichia coli K12 uvrD 










Streptomyces coelicolor 
SCH5.13 


Halobacterium sp. NRC-1 
plasmid pNRC100H1130 


Escherichia coli K12 hepA 


db Match 


u 

kD 
LD 
CO 
CD 
O 
^ 
CN 
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CO 

CO 
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CN 
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prf:2406355A j 




prf:2308356A 
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pir:B75633 
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1^ 
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O 
O 

UJ 
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(/) 










pir:T36671 


CO 

CO 
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h- 

ex 


sp:HEPA_ECOLI


si 


CO 
00 
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00 
CD 


1272 


CN 


CO 

cn 


(J) 

CD 
CO 


4800 


CN 
CO 


3699 


CO 

ro 

to 


2433 


1563 


in 

CO 


ro 
<n 
ro 


CD 

cn 

CO 


cn 

fN 
QO 


6207 


4596


2886 


Terminal 
(nt) 


743067 


743900 


745046 


745622 


748442 


(O 

o 


OO 
00 


748886 


757434


753697 


757630 


768364 


760906


762853


763122 


762582 


767367 


763237 


769547 


774150 


Initial 
(nt) 


743900 


744931 


745513 


746893 


748020 


748026 


CO 
00 


753685 


757063 


757395 


758262 


CO 
U) 

o 

CO 

p"- 


762468 


762497


762730 


762977 


768191 


769443 


CM 

. 

-C" 


777035 


SEQ 
NO. 
(aa) 


4303 


4304 
4305 


4306 


4307 


CO 
o 
ro 


4309 


o 
ro 


4311 


4312 


4313 


4314 


4315


4316 


4317 


4318


a> 
ro 


4320 


4321 


CN 
CN 
CO 


SEQ 
NO. 
(DNA) 


ro 

O 

CO 


1 i <n 

1 o . o 

; QO ] CD 


to 
o 

CO 


o 

CO 


CD 
CD 

' CO 

i 


jo, 

CO 


o 

CD 


CO 


CN 

CO 


CO 


IS 

J 


m 

1 


816 
817 


CO 
CO 


O) 

T— 

OO 


1 8 
1 


CO 


CM 
C\J 
00 
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CO 

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Function 


two-component system response 
regulator 




two-component system sensor 
histidine kinase 


lipoprotein 


hypothetical protein 




30S ribosomal protein or chloroplast 
precursor 


preprotein translocase SecA subunit 




hypothetical protein 


hypothetical protein 


5-enolpyruvylshikimate 3-phosphate 
synthase


hypothetical protein 


5-enolpyruvylshikimate 3-phosphate 
synthase 


hypothetical protein 


RNA polymerase sigma factor 


Matched 
length 
(a.a) 


M 

CN 
CN 




rr 
CO 


in 
oi 
in 


CO 
CN 




CO 

o 

CN 


m 

00 




o 


CN 
CN 
CO 


to 


o 

00 


CO 
CN 


o 

CO 
CO 


CO 
00 


Similarity 
(%) 


90.6 




78.9 


65.6 


72.8 




61.6 


99.6 




78.8 


82.9 


o 
ai 
Oi 


63.9 


100.0 


42.4 


87.2 


Identity 
(%) 


73.7 




53.1 


29.6 


38.0 




34.5


99.1 




47.1 


64.6 


99.0 


38.3 


100.0 


21.6 


CN 
CO 


Homologous gene 


Mycobacterium tuberculosis 
H37RV Rv3246c mtrA 




Mycobacterium tuberculosis 
H37Rv Rv3245c mtrB


Mycobacterium tuberculosis 
H37Rv Rv3244c lpqB


Mycobacterium tuberculosis 
H37RV Rv3242c 




Spinacia oleracea CV rps22 


Brevibacterium flavum 
(Corynebacterium glutamicum) 
MJ-233 secA




Mycobacterium tuberculosis 
H37Rv Rv3231c 


Mycobacterium tuberculosis 
H37RV Rv3228 


Corynebacterium glutamicum 
AS019 aroA


Mycobacterium tuberculosis 
H37RV Rv322ec 


Corynebacterium glutamicum 


Mycobacterium tuberculosis 
H37RV Rv0336 


Mycobacterium tuberculosis 
sigH 


db Match 


< 
o 

CO 

: CN 

1 CN 

! t 




T 

o 

CO 

CN 
CN 

f 


CN 

m 
o 

u. 

CL 


pir:D70592




sp:RR30_SPIOL


gsp:R74093 




<n 
m 
o 

< 


pir:F70590


CO 
CN 
^ 

U- 

< 

cL 

O) 


o 

a> 
in 

o 

Q 

Ol 


CO 
CN 
rr 

< 

CL 

e) 


pir:G70506


Q 

ro 
CO 
CO 

m 
in 

CN 

t: 

Q. 




00 

1^ 

CO 




1497 


1704 


CO 
00 

in 


CO 

tn 


CO 
(O 
CO 


2535 


CN 
1^ 
(O 


o 
m 


r- 

CO 

cn 


1413 


o 

00 


CN 


1110 


CO 


Terminal 
(nt) 


791409 


790736 
793008 


794711 


795301 


795292


796110 


798784 


799691 


800200 


800208 


801190 

i 


803128


802565 


803131 


805025 


Initial 
(nt) 


790732 


791421 


791512 


793008 


794714 


795447 


795448 


795250 


799020 


799697 


801194 


802602 


802649 


802687 


o 

CN 

O 

oo 


804408 


SEQ 

NO. 
(a.a.) 


4340 


4341 


4342 


] 

4343 


4344 


4345 


to 

fO 


CO 


00 

to 




4350 


4351 


4352 


4353 


i 

m 

CO 


4355 


SEQ 
NO 
(DNA) 


o 
■c 

oo 


CO 


CN 
CO 


CO 

TT ^ 
CO 00 


in 

CO 


CO 
00 


00 


oo 
00 


CD 
CO 


o 
m 

00 


in 

CO 


in 

CO 


CO 

in 

00 


• in 

00 


in 
in 

CO 
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Function 


regulatory protein 


hypothetical protein


hypothetical protein 


DEAD box ATP-dependent RNA 
helicase




hypothetical protein 


hypothetical protein 


ATP-dependent DNA helicase




ATP-dependent DNA helicase




potassium channel 


hypothetical protein 

I 


DNA helicase II




hypothetical protein 




Matched 
length 
(aa) 




CN 


to 


00 

in 




CN 


cn 

CN 


1155 




1125 




<N 

o 

CO 


o 

CO 
CN 


o 

CD 
CD 




o 

CO 

CN 




Similarity 
(%) 


96.4 


65.1


CN 
CN 
CD 


64.0 




69.8 


65.9 


o> 

CO 




65.7 




64.2 


58.3 


OO 

cd 
in 




49.3 




Identity 
(%) 


78.6 


33.3 


29.6 


37.3 




46.4 


o 

CO 


23.9 




41.4 




26.2


30.4 


32.6




26.8 




Homologous gene 


Mycobacterium tuberculosis 
H37RV Rv3219 whiB1 


Mycobacterium tuberculosis 
H37Rv Rv3217c


Mycobacterium tuberculosis 
H37Rv Rv3212


Klebsiella pneumoniae CG43 
deaD 




Mycobacterium tuberculosis 
H37RV Rv3207c 


Mycobacterium tuberculosis 
H37RV Rv3205c 


Mycobacterium tuberculosis 
H37Rv Rv3201c




Mycobacterium tuberculosis 
H37RV Rv3201c 




Methanococcus jannaschii JAL-
1 MJ0138.1


Mycobacterium tuberculosis 
H37RV Rv3199c 


Escherichia coli K12 uvrD 




Mycobacterium tuberculosis 
H37RV Rv3196 




db Match 


pir:D70596 


to 

LO 

o 

CO 


pir:E70595 


z 
a 

LU 
-J 

< 
LU 
O 

'6. 
in 




pir:H70594 


pir:F70594 


pir:G70951 




pir:G70951




< 

ID 

5 
cn' 

CO 
>- 

b. 
•«/> 


pir:E70951 


sp:UVRD_ECOLI 




pir:B70951 




u 


GO 

in 

CN 


o 

CN 


1200 


1272 


m 

CN 
CN 


CD 
00 


(j> 
m 


00 
TT 

O 
CO 


O 
00 

r*- 


3219 


1332 


1005 




2034 


03 
iO 


CD 
CO 


CO 

o 
to 


Terminal
(nt) 


805535 


806737


806740 


807946


809510 


810394 


811163 


r-. 

CN 


811386


817422 


814210 


818523 


819236 


821287 


822669


821290 


823391 


Initial 
(nt) 


805792 


806318 


807939 


809217 


809286 


809549 


810405 


811170 

i 


812165 


814204 


in 
in 

CD 


817519 


818523 


819254


822079 


822105 


822789 


SEQ 
NO. 
(a.a.) 


4356 


4357 


4358 


4359 


4360 


to 


4362 


4363 


4364 
4365 


1 <D 
• CD 


r~ 
CD 

ro 
■q- 


! 

4368 


4369 


o 

CO 


CO 

NT 


4372 


SEQ 
NO. 
(DNA) 


to 
m 

00 


to 

00 


00 

KO 
00 


CO 


■ o 




CN 
CO 
00 


CO 
CD 
! CO 


CO 
00 


m 

CD 
CO 


u 


i N- 

1 ^ 
OO 


00 
CD 

CO 


O) 
(D 
00 


o 

CO 


r*- 

GO 


CN 
00 
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Function 


hypothetical protein 


hypothetical protein 






hypothetical protein 


regulatory protein 


ethylene-inducible protein 


hypothetical protein 


hypothetical protein 




alpha-lytic proteinase precursor 




DNA-directed DNA polymerase


major secreted protein PS1 protein 
precursor 










monophosphatase




Matched 
length 
(a.a) 




o 
m 
n 






1023 


CO 
CO 


o 
ro 


GO 


o 

(N 




00 

o 




00 

o 

CN 


CO 
CO 
CO 










in 
in 

CN 




Similarity 


76.4 


74.9 






73.5 


57.7 


89.0 


53.0 


73.6 








51.4 


51.5 










74.9 




Identity 


42.6 


43.4 







47.2 


34.3


67.4 


49.0 


40.8




26.7 




25.0 


27.0 










51.8 




Homologous gene 


Mycobacterium tuberculosis 
H37RV Rv3195 


Mycobacterium tuberculosis 
H37RV Rv3194 






Mycobacterium tuberculosis 
H37RV Rv3193c 


Deinococcus radiodurans 
DR0840 


Hevea brasiliensis laticifer er1 


Aeropyrum pernix K1 APE0247 


Bacillus subtilis 168 yaaE 




Lysobacter enzymogenes ATCC 
29487 




Neurospora intermedia LaBelle- 
1b mitochondrion plasmid 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 cspl 










Streptomyces alboniger pur3


db Match 


pir:A70951 


pir:H70950 






o 
in 

o 
O 
a. 


«' 

ro 
0> 

o 
o 

UJ 

< 

cn 


a 

CQ 

> 

LU 
X 

1 

£ 

LU 

iCL 
(/) 


CN 
CO 

CN 

r>- 

U- 

q: 

Ol 


C/3 
U 
< 
CL 

1 

LU 
>- 

id 
<n 




pir:TRYX34




CN 
CN 

r-- 

CO 

o 

C/3 

'o. 


sp:CSP1_CORGL










X 

CO 
CN 

r*- 
o 

CN 
CN 

r 

CX 






to 


1050


in 
CO 


CN 
CN 

in 


2955 


1359 


in 

O) 


CO 


o 
§ 


ro 
^ 


1062 


o 
in 


m 

00 

m 


1581 


cn 

CN 


o 
m 


CN 


a> 
O 

CO 


O 
00 




Terminal 
(nt) 


822680


825239 


825242


825996


829570 


829627 


931971 


831578 


832570 


832795 


834633 


835388 


835837 


838892 


839353 


840139 


840210


840437


841517 




Initial 
(nt) 


in 

CN 

'cr 

CM 

CO 


824190 


825916 


826517


825616 


830985 


831021 


831922 


831971 


833157 


833572 


CO 
CD 
CO 
^ 
CO 
CO 


835253 


837312 


838925 


839630 


840431


840745


842296 




SEQ 
NO. 
(a.a.)


4373 


CO 


4375 


4376 


4377 


4378 


4379 


4380 


4381 


4382 


4383 


V 
00 

r> 


4385 


4386 


4387 


4388 


4389 


4390 


4391




SEQ 
NO. 
(DNA) 


n 
cc 


00 


m 
co 


CO 
00 


05 


00 
00 


CJ> 
00 


o 

CO 
00 


a> 

00 


(N 
00 
00 


CO 
CO 
00 


CO' 
00 


. in 

00 
00 


CO 
m 

GO 


00 
CO 

1 


00 
CO 
CO 


cn 
CO 

00 


o 

00 


<— 

o> 

00 
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ft) 
a 
c 

c 

8 



0) 

n 

.to 



Function 


myo-inositol monophosphatase


peptide chain release factor 2 

— — 


cell division ATP-binding protein 


hypothetical protein 


cell division protein 


small protein B (SSRA-binding 
protein) 


hypothetical protein 






vibriobactin utilization protein 


Fe-regulated protein 


hypothetical membrane protein 


ferric anguibactin-binding protein
precursor 


ferrichrome ABC transporter 
(permease) 


ferrichrome ABC transporter 
(permease) 


ferrichrome ABC transporter (ATP- 
binding protein) 


Matched 
length 
(aa) 


CO 
CN 


o> 
tr> 
r) 


to 

CN 
CN 


CM 
1^ 


o 

CO 


in 

V 


(D 






rM 

CM 


CJ) 

T— 

CO 


r- 

o> 


in 

(N 
CO 


CO 
CO 


CM 
CO 


o 
m 

CN 


>^ 


































1^ 


59.3 


88.6 


91.2 


54.0 


74.8 


75.9 


73.3 






CJ^ 
CM 

m 


58.3 


71.2 


in 

CD 


80.8 


76.0 


82.0 


c/} 


































Identity
(%)


ro 
ro 


68.0 


70.4 


o 

CO 


in 
d 


in 
to 


44.0 






26.8 


29.5 


36.1 


27.7 


39.3 


35.6


48.4 


Homologous gene 


Streptomyces flavopersicus 
spcA 


Streptomyces coelicolor A3(2) 
prfB 


Mycobacterium tuberculosis 
H37Rv Rv3102c ftsE


Aeropyrum pernix K1 APE2061 


Mycobacterium tuberculosis 
H37Rv Rv3101c ftsX


Escherichia coli K12 smpB 


Escherichia coli K12 yeaO 






Vibrio cholerae OGAWA 395 
viuB 


Staphylococcus aureus sirA 


Mycobacterium leprae 
MLCB1243.07 


Vibrio anguillarum 775 fatB 


Bacillus subtilis 168 yclN 


Bacillus subtilis 168 yclO


Bacillus subtilis 168 yclP 


db Match 


to 

CO 

o 

D 
d. 
o> 


O 
u 

1- 
co 

cn' 

u. 
(X 

Q. 
in 


pir:E70919 


PIR:G72510 


pir:D70919 


_j 
O 
o 

LU 

1 

CD 

CL 

C/) 
d. 

iA 


-J 

O 
o 

o' 

< 
HI 
>- 

d. 
%n 






X 

o 

CO 

> 

1 

03 
D 
> 

o. 

in 


< 

(O 

CO 

o 
in 

CM 

f 
cx 


gp:MLCB1243_5 


sp:FATB_VIBAN 


pir:B69763 


pir:C69763 


pir:D69763 


li 


CO 


1104 


r-- 
co 

U3 


CD 
CN 


o 
o 

CD 


CM 

cn 


in 
m 


< 

CO < 

in ( 


□ in 

D O 


in 

CM 
00 


oo 


00 

00 

in 


1014 


O) 

cn 
cn 


CM 

■c 

CD 


ro 
in 
r-- 


Terminal 
(nt) 


842306 


844360 


845181 


(N 
«o 


846097 


CO 

c>* 

iO 
CD 
XT 
00 


846982 


846269 


847718 


1 

848499 


849326 


850412 


852364 


853616 


854724 


855476 


Initial 
(nt) 


843124 


843257

1 


in 

CD 
CO 


845105 


845198 


846137 


846632 


846805 


CN 

"M CM 

T— 

CO 

CO oo 


849323 


850243 


850999


851351 


852618 


853783 


CM 

in 

CO 


(/) 2 


4392


4393 


"C 
O) 
ro 


4395 


4396 


4397 


4398


4399 


O O 


4402 


CO 

o 


4404 


in 
o 


4406 


4407 


CO 

O i 


Sol 

(/3 ^ 9. 


CM 

O) 
CO 


ro 
CO 


CD 
CO 


tn 

O) 


! cr> 

1 " 


CD 

i " 


898 


668 


a T- 

o o 


fN 
O 
O) 


(O 

o 
cn 


O 

cr> 


in 
o 

Q} 


s 

C7) 


o 

CJ) 


1 

§ 
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3 



0) 



Function 


hypothetical protein 


hypothetical protein 


kynurenine 

aminotransferase/glutamine 
transaminase K 




DNA repair helicase 


hypothetical protein 


hypothetical protein 




resuscitation-promoting factor 


cold shock protein 


hypothetical protein 


glutamine cyclotransferase 


1 




permease 




? 

d) </) 
c « 

i| 
^ s 

TO 

¥ 1 




Matched 
length 
(aa) 


CO 




CM 




CO 
CD 


CD 


p- 
in 




oo! 

o> • 


CD 


Ol 
in 


CO 
CM 










1 

cn 

CO 




Similarity 
(%) 


72.0 


66.0 


64.9 




62.3 


65.2 


62.0 




64.7


in 


58.5 


67.8 






79.3 




51.7 




Identity 
(%) 


66.0 


61.0 


33.5 




30.7 


36.1 


44.0 




39.4 


42.6


28.3 


41.8






to 
CO 




27.9 




Homologous gene 


Chlamydia muridarum Nigg 
TC0129 


Chlamydia pneumoniae 


Rattus norvegicus (Rat) 




Saccharomyces cerevisiae 
S288C YIL143C RAD25 


Mycobacterium tuberculosis 
H37Rv Rv0862c


Mycobacterium tuberculosis 
H37Rv Rv0863




Micrococcus luteus rpf 


Lactococcus lactis cspB 


Mycobacterium leprae 
MLCB57.27C 


Deinococcus radiodurans 
DR0112 






Streptomyces coelicolor A3(2) 
SC6C5.09 




Streptomyces azureus tsnR 




db Match 


r>- 

CO 

r- 

r- 
CO 
LL 

cd 
a 


GSP:Y35814 


pir:S66270




f- 
co 
< 

UJ 

>- 

I 

id 


pir:F70815 


pir:G70815




< 

CM 
O 

in 
o 

CM 
rr 
CM 

t 
CL 


< 

s 

CM 
CO 
CM 
t 
Q. 


gp:MLCB57_11 


gp:AE001874_1 






O) 

in' 
O 
to 
o 

CO 

iCL 
Ol 




sp:TSNR_STRAZ 




It 


p- 


CO 
CN 


cn 
o 

CM 


: cn 
m 

CO 


1671 


2199 


CN 


ro 

CO 


r*- 

<n 
in 


00 
CO 


in 

CM 

m 




cn 
o> 

CO 


00 
CO 


1473 


CM 
O) 


00 
CN 
00 


CD 

h- 
oo 


Terminal 
(nt) 


860078 


860473 


862752 


862753


863396 


865119 


867571 


868630 


867803 


869318


869379 


869918 


870721 


871660 


873210 


872016 


O 

o 

p^ 
00 


874069 


Initial
(nt) 


860224 


in 
o 

CD 
00 


ID 

CO 


863391 


865065 


867317 

1 


867353 


867788 


868399 


868938 
869903 


870691 


871419 


871523 
871738 


872927


873213 


p- 

CO 


C/) ^ 3 


4409 


4410 
4411 


CM 


4413 


4414 


4415 


4416 




CO ! <ji 

! TT 

^ 1 


o 

CM 


CM 


4422 


4423 


CM 


1 s 

1 ? 


4426 


(/) z 9 


<ji 
o 


o 

CD 

1 


! ^ 

J ^ 


CN 

CD 


I CO 

1 




in 


(D 

; CD 


cn 


(CO cn 

! o j cn 


o 

CN 




CM 

cn 


1 CO 

j o. 


CM 


in 
cs; 
cn 


I 
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Function 


hypothetical protein 


phosphoserine transaminase 


acetyl-coenzyme A carboxylase 
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hypothetical protein 
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ribosomal-protein-alanine N- 
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hypothetical membrane protein 
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transcriptional regulator (tetR 
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hypothetical membrane protein 


two-component system sensor 
histidine kinase 


two component transcriptional 
regulator (luxR family) 




hypothetical membrane protein 


ABC transporter 




ABC transporter 


gamma-glutamyltranspeptidase 
precursor 










transposase protein fragment 
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Function 


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 


multidrug resistance-like ATP- 
binding protein, ABC-type transport 
protein 


ABC transporter 


hypothetical membrane protein 




hypothetical protein 






lpqU protein


enolase (2-phosphoglycerate
dehydratase) (2-phospho-D-
glycerate hydro-lyase)


hypothetical protein


hypothetical protein 


hypothetical protein 


guanosine pentaphosphatase or 
exopolyphosphatase 




threonine dehydratase 
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Homologous gene 


Neisseria gonorrhoeae 


Escherichia coli mdlB 


Mycobacterium tuberculosis 
H37RV Rv1273c 


Corynebacterium glutamicum
ATCC 13032 orf3 




Bacillus subtilis yabN 






Mycobacterium tuberculosis 
H37Rv Rv1022 lpqU


Bacillus subtilis eno 


Aeropyrum pernix K1 APE2459 


Mycobacterium tuberculosis 
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Mycobacterium tuberculosis 
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Function 




hypothetical protein 


transcription activator of L-rhamnose 
operon 


hypothetical protein 




hypothetical protein 


transcription elongation factor 


hypothetical protein 


lincomycin-production 




3-deoxy-D-arabino-heptulosonate-7-
phosphate synthase 




hypothetical protein or undecaprenyl
pyrophosphate synthetase 


hypothetical protein 






pantothenate kinase 


serine hydroxymethyl transferase 


p-aminobenzoic acid synthase 
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Function 


biotin carboxylase 












hypothetical protein 


magnesium chelatase subunit 


2,3-PDG dependent
phosphoglycerate mutase 


hypothetical protein 


carboxyphosphonoenolpyruvate 
phosphonomutase 


tyrosin resistance ATP-binding
protein 


hypothetical protein


alkylphosphonate uptake protein


transcriptional regulator 


multi-drug resistance efflux pump 


transposase (insertion sequence 
IS31831)
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Table 1 (continued) 


Homologous gene 


Synechococcus sp. PCC 7942 
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Mycobacterium tuberculosis 
H37RV Rv0959 


Rhodobacter sphaeroides ATCC 
17023 bchI 


Amycolatopsis methanolica pgm 


Mycobacterium tuberculosis 
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■ Streptomyces hygroscopicus 
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Escherichia coli K12 MG1655 
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i Streptococcus pneumoniae 
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Corynebacterium glutamicum 
(Brevibacterium lactofermentum) 
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Function 


excinuclease ABC subunit A 


thioredoxin peroxidase 






hypothetical membrane protein 


oxidoreductase or thiamin 
biosynthesis protein 




! 

i 






chymotrypsin BII


arsenate reductase (arsenical pump 
modifier) 


hypothetical membrane protein


hypothetical protein 


hypothetical protein 


GTP-binding protein (tyrosine 
phosphorylated protein A)


hypothetical protein 


hypothetical protein 
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Table 1 (continued) 


Homologous gene 


Thermus thermophilus uvrA


Mycobacterium tuberculosis 
H37RV tpx 
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Streptomyces coelicolor A3(2) 
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Function 


hypothetical protein 


ATPase


hypothetical protein 


hypothetical protein 


hypothetical protein 






2-oxoglutarate dehydrogenase


ABC transporter or multidrug 
resistance protein 2 (P-glycoprotein 
2) 


hypothetical protein 


shikimate dehydrogenase 


para-nitrobenzyl esterase 








tetracycline resistance protein 


metabolite export pump of 
tetracenomycin C resistance 
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Mycobacterium tuberculosis 
H37RV Rv1224 


Escherichia coli mrp


Mycobacterium tuberculosis
H37Rv Rv1231c


Mycobacterium tuberculosis
H37Rv Rv1232c


Mycobacterium tuberculosis 
H37Rv Rv1234






Corynebacterium glutamicum 
AJ12036 odhA


Cricetulus griseus (Chinese
hamster) MDR2 
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Function 


DEAD box ATP-dependent RNA 
helicase 


bacterial regulatory protein, tetR
family 


pentachlorophenol 4- 
monooxygenase 


maleylacetate reductase 


catechol 1,2-dioxygenase




hypothetical protein 


transcriptional regulator 




hypothetical protein 


phosphoesterase 


hypothetical protein 






esterase or lipase 
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length 
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oo 

CO 




r) 
o 

CM 


in 
o 

CO 


in 

O) 






CD 
CN 
CM 






Similarity 
(%) 


74.3 


47.4 


47.7 


72.0 


59.4 




58.4 


55.4 




56.2 


67.3 


59.6 






64.6 






Identity 
(%) 


48.1 


24.7 


24.5 


40.4 


30.6 




31.9 


24.9 




29.6 


39.2 


29.7 






37.3 






Homologous gene 


Klebsiella pneumoniae CG43
DEAD box ATP-dependent RNA
helicase deaD


Mycobacterium leprae 
B1308_C2_181 


Sphingomonas flava pcpB 


Pseudomonas sp. B13 clcE


Acinetobacter calcoaceticus 
catA 




Mycobacterium tuberculosis 
H37Rv Rv2972c


Saccharomyces cerevisiae 
SNF2 




Streptomyces coelicolor A3(2) 
orfZ 


Mycobacterium tuberculosis 
H37Rv Rv1277


Mycobacterium tuberculosis 
H37Rv Rv1278 






Petroleum-degrading bacterium 
HD-1 hde 




j 


db Match 


2 

a 

LU 

_j 

q' 
< 

UJ 

Q 

id 
to 


prf:2323363BT 


sp:PCPB_FtJKS3 


sp:CLCE_PSESB 


< 

o 

o 

< 

; O 

i ^ 




pir:A70672 


\- 
< 

LU 

> 

cm' 

LL 
(/3 

CL 




gp:SCO007731_6 


pir:E70755 


h- 
O 
> 

CO 

o 

> 

CL 
UJ 






CD 

cn 

CO 

CN 
O 

CO 

< 

id 




— !. 




2196 


<o 


1590 


1068 


00 
00 




o 
in 


3102 


1065 


CO 

in 
oo 


1173 


2628 


<D 
O 

r) 


CO 

r> 


r- 


00 
CO 


CD 
CO 


Terminal 
(nt) 


1212129 


1212429 


1214858 


1215938


1216836 


1216904 




1222996


00 

CN 
CN 


1223843 


1225059 


1227693 


1227282 


1227340 


1228636 


1229095 


1229935 


Initial 
(nt) 


1209934 


1213115 

1 


1213269 


1214871 


1215952 


1217374


1217982 


1219895 


1222905 


1222986 


1223887 


1225066 


1227587 


1227657


1227863 


1228718 


1229150 


SEQ 
NO. 
(a.a.)


4778 
4779 


4780 


co 


4782 


4783


00 


4785 


4786 


4787 


4788 


4789 


4790 


O) 


4792 


4793 


4794




CO 
■ CM 


CJ) 


o 

CO 
CM 


co 

CN 


CN 
00 
CN 


1283 


1284 


in 

00 
CM 


CD 

CO 00 
CN CN 


00 
00 
CN 


C3) 
CO 
CM 


o 

C7> 
CM 


1291 


CM 
O) 
CM 


CO 
CJ) 
CN 


C3) 
CN 
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5 
10 


Function 


short-chain fatty acids transporter 


regulatory protein 






fumarate (and nitrate) reduction 
regulatory protein 


mercuric transport protein periplasmic
component precursor 


zinc-transporting ATPase Zn(II)-
translocating P-type ATPase 


GTP pyrophosphokinase (ATP:GTP
3'-pyrophosphotransferase) (ppGpp
synthetase I)


tripeptidyl aminopeptidase 






homoserine dehydrogenase 






nitrate reductase gamma chain 


nitrate reductase delta chain 


nitrate reductase beta chain 


hypothetical protein 


hypothetical protein


nitrate reductase alpha chain 


nitrate extrusion protein




Matched 
length 


(N 
CM 

T- 


(O 
CO 






CO 
CM 
CM 


00 


m 
o 

CO 


(*) 


o 

CD 






CM 






o 

CN 
CN 


in 
1^ 


in 
o 
m 


r- 
<o 


CO 
GO 


CN 


CO 




Similarity 
(%) 


69.7 


CO 

to 
in 






57.9 


66.7 


70.6 


58.4 


49.3 






98.0 






69.6 


63.4 


83.4 


48.0 


55.0 


73.8 


cn 
CO 




Identity 


37.7 


24.7 






25.0 


33.3 


38.0 


32.9 


26.6 






95.0 






45.0 


ro 

CD 

ro 


56.6 


36.0 


36.0 


46.9 


32.8 


<y 

C 

o 

u 

50 7 

£1 



Homologous gene 


Streptomyces coelicolor 
SC1C2.14catoE 


Erwinia chrysanthemi recS






Escherichia coli K12 MG1655 fnr 


Shewanella putrefaciens merP 


Escherichia coli K12 MG1655 
atzN 


Vibrio sp. S14 relA 


Streptomyces lividans tap 






Corynebacterium glutamicum 






Bacillus subtilis narI


Bacillus subtilis narJ


Bacillus subtilis narH


Aeropyrum pernix K1 APE1291


Aeropyrum pernix K1 APE1289


Bacillus subtilis narG


Escherichia coli K12 narK 


db Match 


_j 
O 
O 

UJ 

1 

UJ 

O 

H 
< 

CL 


X 

o 

LU 

1 

CO 

o 

LU 
Q. 

iaL 
in 






-J 
O 
o 

1 UJ 

' K 

u_ 

id 
1/) 


D 
Q. 

UJ 
X 

CO 

1 

a. 

q: 

UJ 

S 

ci 


..... j 
sp:ATZN_ECOLI 


sp:RELA_VIBSS 


o 

8 

CO 

a 

Q. 
1/1 
D) 






C7] 
CO 

a 

Ql 
CO 

o 






D 
CO 

CO 

1 

< 
2 
b. 

</) 


r) 

CO 

o 
< 

2 

cL 
w 


sp:NARH_BACSU 


PIR:D72603 


PIR:B72603 


CO 

o 
< 

CD 

1 

O 
a. 

V) 


sp:NARK_ECOLI 




gi 


ro 
\n 


CO 


C\J 
CM 
CM 


ai 
\n 


o 
in 
h- 


to 

CM 


1875 


CO 


1581 


CO 

o 


o 

CM 


00 

o 


1260 


o 

Ol 
CO 


1^ 


CM 
CO 


1593


C7> 

in 


CO 
CN 


3744 


1350 




Terminal 
(nt) 


1229180 


1230480 


1230831 


1230914 


1232479 


1232836 


1234881 

1 


1235612 


1236545


1241554


1242156


1243728


1243942


1244843 


o 

CN 

in 

CN 


1246508 


1247199 


1250444 


1251817 


1248794 


1252557 




Initial 
(nt) 


1229716 


1229995 


1230610 


1231432 


1231730 


1232603 


1233007



1234983 


1238125 


to 
m 

CN 
CM 


1242275 


1243621 


1245201


1245532 


CD 
CJJ 

to 

CN 


<7) 
CO 
CM 

V 
CN 


1248791


1249851 


1251545 


1252537 


1253905 




SEQ 
NO. 
(a.a.) 


4795 


4796 


4797 


4798 


4799 


4800


o 

CO 


4802 


4803 


4804


4805


4806


4807


4808


4809


4810


4811


4812


4813


4814


4815 






1295 


1296 


h- 
Qi 
CM 


00 
Gi 
(\ 


Oi 
Oi 
CM 


o 
o 

CO 


o 
ro 


CM 
O 
CO 


CO 

o 
ro 


S 
ro 


in 
o 

CO 


g 


o 

CO 


CO 

o 

CO 


<Ti 
C3 

ro 


o 
to 


CO 


CN 
CO 


ro 

CO 


?> 
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3 



Function 


molybdopterin biosynthesis cnx1 
protein (molybdenum cofactor 
biosynthesis enzyme cnx1) 


extracellular serine protease 
precursor




hypothetical membrane protein 


hypothetical membrane protein 


molybdopterin guanine dinucleotide 
synthase 


molybdopterin biosynthesis protein


molybdopterin biosynthesis protein
(molybdenum
cofactor biosynthesis enzyme)


medium-chain fatty acid-CoA ligase


Rho factor 








peptide chain release factor 1 


protoporphyrinogen oxidase 




hypothetical protein 


undecaprenyl-phosphate alpha-N- 
acetylglucosaminyltransferase 


Matched 
length 
(a.a) 




00 
CO 




CO 
CO 


CM 


00 

r- 


CO 
CO 
CO 


in 

CO 


CM 

in 


CO 

in 








CO 
CO 
CO 


o 

00 
CN 




vn 

r- 
CN 


CN 
CN 
CO 


Similarity 
(%) 


65.0 


45.9 




62.6 


60.2 


52.3 


58.2 


73.7 


65.7 


73.8 








71.9 


57.9 




86.0 


58.4 


Identity 
(%) 


in 
oi 
cn 


21.1 




30.8 


31.6 


27.5 


32.8 


51.4 


36.7 


50.7 








41.9 


31.1 




62.3 


31.1 


Homologous gene 


Arabidopsis thaliana CV cnx1 


Serratia marcescens strain IFO- 
3046 prtS 




Mycobacterium tuberculosis 
H37Rv Rv1841c


Mycobacterium tuberculosis
H37Rv Rv1842c


Pseudomonas putida mobA


Mycobacterium tuberculosis
H37Rv Rv0438c moeA


Arabidopsis thaliana cnx2 


Pseudomonas oleovorans


Micrococcus luteus rho 








Escherichia coli K12 RF-1


Escherichia coli K12




Mycobacterium tuberculosis 
H37Rv Rv1301 


Escherichia coli K12 rfe


db Match 


T 
X 

z 
o 

CL 

to 


sp:PRTS_SERMA 




O 
>- 

^' 

Q 
o 
> 

iCL 

\n 


h- 
O 

1 

o 
> 

id. 
tn 


^1 

CN 

in 

O) 
CN 

04 

3 • 
Q- 
Q- 
cL 
cn 


-J 
O 
U 

LU 

<' 
UJ 

O 

Q. 
Wl 


I 
1- 
< 

DC 
CN 

X 

z 
u 

id 

V) 


sp:ALKK_PSEOL 


D 
-J 
O 

5 
X 

oc 

CL 
V) 








_j 
O 
o 

UJ 

I 

LL 

CL 

<n 


-J 
O 
U 

UJ 

5 

UJ 
X 

CL 




sp:YD01_MYCTU 


_j 
O 

o 

UJ 

UJ 
LL 

oc 

d 
cn 


si. 


O) 
CO 

•V 


9981 


00 
CD 


1008 


1401


r- 
CO 

m 


1209


1131 


1725 


2286 


CO 

o 

CO 


CO 

en 

CO 


1023 


1074 


CO 
GO 


^ CD 

r»>^ . to 


CO 


Terminal 
(nt) 


1254634 


1254737 


1257750 


m 

CO 
CO 

in 

Csl 


1257865 


1259429 


1259993 


1261688 


1262386 


1267427


1266267


1265611


1265427


1268503 


1269343 


1268267 


1270043 


1271192 


Initial 
(nt) 


iD 
m 

CM 


1256602 


1257067 


1257858 


1259265 


1259989


1261201 


1262818 


1264610 


1265142


1265665


1266306 


U) 

•<r 
rr 

CO 
CO 
CN 


o 

CO 
CO 

eg 


1268507 


1269040 


1269396 


1270047 


w 2 


to 


4817 


4818 


J— 


o 

CN 
CO 


4821 


CN 
CN 
CO 


<n 

CN 
OO 

N- 


CN 

CO 


4825 


CO 
(N 
CO 

''f 


CN 
CO 


4828 


4829 


4830


4831


4832


4833


Sol 


CO 

n 


<o 


CO 
CO 


<j) 

CO 


o 

(N 
CO 


CN 
CO 


CN 
CN 
CO 


CO 
CN 
CO 


CN 
CO 


in 

fN 
CO 


CO 
CN 
CO 


CN 
CO 


00 
Csl 
CO 


C7> 
CN 
CO 


o 

CO 
CO 


1331 


CN 
CO 
CO 


<n 

CO 
CO 
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Function 




hypothetical protein 


ATP synthase chain a (protein 6)


H+-transporting ATP synthase lipid-
binding protein, ATP synthase C
chain


H+-transporting ATP synthase chain
b


H+-transporting ATP synthase delta
chain


H+-transporting ATP synthase alpha
chain


H+-transporting ATP synthase
gamma chain


H+-transporting ATP synthase beta
chain


H+-transporting ATP synthase
epsilon chain


hypothetical protein


hypothetical protein


putative ATP/GTP-binding protein


hypothetical protein


hypothetical protein 


i 

0) 
O 

\E 




Matched 
length 
(aa) 




o 

CO 


in 






CN 


CO 

in 


a 

CN 
CO 


CO 
CO 


CN 
CN 

T— 


CN 

CO 


o 

CO 
CN 


in 
a> 


CO 


o 


o 

CO 




Similarity 
(%) 






































99.0 


56.7 


85.9 


Oi 
(6 

iD 


67.2 


88.4 


76.6 


100.0 


73.0 


67.4 


85.7 


56.0 


68.7 


79.2 


71.4 




Identity 
(%) 




98.0 


24.1


54.9


27.8 


34.3 


Oi 

!S 


46.3 


99.8 


41.0 


38.6 


70.0 


o 

lb 


35.8


54.5 


37.9 


Table 1 (continued) 


Homologous gene 




Corynebacterium glutamicum 
atpI


Escherichia coli K12 atpB 


Streptomyces lividans atpL


Streptomyces lividans atpF 


Streptomyces lividans atpD 


Streptomyces lividans atpA 


Streptomyces lividans atpG 


Corynebacterium glutamicum 
AS019 atpB 


Streptomyces lividans atpE 


Mycobacterium tuberculosis 
H37Rv Rv1312


Mycobacterium tuberculosis
H37Rv Rv1321


Streptomyces coelicolor A3(2) 


Bacillus subtilis yqjC


Mycobacterium tuberculosis 
H37Rv Rv1898 


Mycobacterium tuberculosis 
H37Rv Rv1324




db Match 




(N 
O 

m 
< 
D 
CL 
O 


sp:ATP6_ECOLI 


-J 
a: 

W 

J 
CL 

< 

CL 

in 


—1 
q: 
1- 
co 

1 

u. 

a 
< 

Cl 


-J 

DC 
1- 
W 

a' 

Q- 
< 

iCL 

w 


-J 
cn 
\- 

<' 

< 
a, 
*n 


sp:ATPG_STRLI


a 
O 

o 

1 

oa 

< 

cL 
in 


sp:ATPE_STRLI 


D 

h- 

> 

i 

o 
>• 

d. 


D 
1- 

O 
>- 

CO 

o 
> 

Cl 
in 


in 

CO 

in' 

O 

CD 

CN 
O 

w 
a 
O 


ID 
W 

o 
< 

CD 

o' 

-o 
O 
> 

b. 
in 


I- 
U 
> 

CN 

o 
> 

CL 
(/) 


sp:YD24_MYCTU 




u 


CD 
CO 
M- 


o> 

CN 


o 

CO 


O 
CN 


CO 

m 


CO 

T— 

00 


1674 


m 
cn 


1449 


CN 
CO 


N- 


O 

<j> 
CO 


in 

00 
CN 


CO 

m 
^ 


CN 
CO 


CN 




Terminal 
(nt) 


1271698 


1272119 


1273149 


1273525 


CN 
CN 

CN 


CO 
CN 


1276648 


1277682 


1279136 


1279522 


1280240 


1280959 


1281251 


1281262


1282105 


T— 

CO 
00 
<N 




Initial 
(nt) 


1271213


1271871


1272340 


1273286 


1273559 


1274131 


1274975 


1276708 


1277688 


1279151 


1279770 


1280270 


1280967 


1281714 


1281794 


Oi 

CM 
00 
CN 




SEQ 
NO 
(a.a.)


CO 
CO 


4835 


CD 
CO 
CO 


4837 


4838 


4839 


4840 


4841 


4842 


4843 


4844 


4845 


4846


4847 


4848 


00 

^ 




SEQ 
NO. 


■<T 
CO 
CO 


1335 


CD 
CO 
CO 


CO 
CO 


CO 

CO 
CO 


C7> 
CO 
CO 


o 

N^ 
CO 


CO 


CM 
CO 


ro 
CO 


N- 

cO 


to 

CO 


CO 
CO 


r- 
ro 


CD 
CO 


TT 
CO 






8/17/2007, EAST Version: 2.1.0.14 



EP1 108 790 A2 



3 



Function 


FMNH2-dependent aliphatic 
sulfonate monooxygenase 


aliphatic sulfonates transport
permease protein


aliphatic sulfonates transport
permease protein 


sulfonate binding protein precursor 


1,4-alpha-glucan branching enzyme 
(glycogen branching enzyme) 


alpha-amylase 




ferric enterobactin transport ATP- 
binding protein or ABC transport 
ATP-binding protein 


hypothetical protein 


hypothetical protein 




electron transfer flavoprotein beta- 
subunit 


electron transfer flavoprotein alpha 
subunit for various dehydrogenases




nitrogenase cofactor synthesis protein




hypothetical protein 


Matched 
length 
(aa) 


ro 


o 

CN 


CO 
CN 
CN 


CO 


o 


CO 




r" 
(N 


o 

(O 
CN 


1^ 




s 


in 
ro 
ro 




in 

CO 




(V. 

O) 
CO 


Similarity


74.3 


75.8 


72.8 


62.1 


CN 


50.5 




87.6 


68.5 


70.0 




64.8 


61.8 




67.7 




55.7 


Identity 
(%) 


50.3 


CO 

d 


d 


35.1 


CD 


22.9 




31.8 


39.6 


43.1 




CN 
CO 


33.1 




35.2 1 




29.5 


Homologous gene 


Escherichia coli K12 ssuD


Escherichia coli K12 ssuC


Escherichia coli K12 ssuB


Escherichia coli K12 ssuA 


Mycobacterium tuberculosis 
H37Rv Rv1326c glgB


Dictyoglomus thermophilum
amyC 




Escherichia coli K12 fepC


Mycobacterium tuberculosis
H37Rv Rv3040c


Mycobacterium tuberculosis 
H37Rv Rv3037c 




Rhizobium meliloti fixA 


Rhizobium meliloti fixB 




Azotobacter vinelandii nifS 




Rhizobium sp. NGR234 plasmid 
pNGR234a y4mE 


db Match 


gp EC0237695_3 


-J 
0 

o 

UJ 

1 

u 

D 

CO 

to 
d. 

tA 


sp:SSUB_ECOLI 


-J 
O 
o 

UJ 

<' 

<0 
CO 

CL 

to 


_j 

O 
o 

UJ 

1 

CD 

CD 
_j 
O 
Id 
t/) 


X 
1- 
o 

Q 
> 

< 

ci. 
in 




_j 
O 
(J 

'^i 

o 
a 

LU 
LL 

Q. 

(/> 


pir:C70860 


o> 
in 

00 

O 

X 

l: 

Q. 




LU 

S 

X 

a: 

i 

LL 
6. 
U) 


LU 
X 

"^1 

m 

X 

LL 

'6. 

Ui 




> 
O 

^1 
CO 

u. 
z 

cL 




Z 

C/) 
X 

d: 

1 

LU 

5 
> 

icL 


u 


1143 


00 
CD 


O) 
CN 


in 

di 


2193 


1494 


00 
ro 


cx> 


^j- 
o 

00 


1056 


CD 


CD 
00 


t— 
m 

O) 


in 
<o 


1128 


CM 
r— 
CO 


1146 


Terminal
(nt) 


1284466 
1285284 


1286030 


1286999 


1287281 


1289514 


1291373 


1292577 


m 

8 

O) 

CNJ 


1295206 


1294436 


1296220 


1297203 


1297093 


1298339


CN 

CO 
00 
<Tt 
CN 


1299000 


Initial 
(nt) 


1283324 


1284517 


1285302


1286043 


1289473 


1291007 


1291026 


— 1 
1291699 


1293222 


in 

cn 

CN 


1295047 


in 

CO 

^ 

in 


1296253 


1296479 


1297212 


1298653 


1300145 


SEQ 
NO. 
(a.a.) 


4850


un 

CO 


4852 


<o 
in 

OO 


4854 


4855 


CD 

in 

CD 


4857 


CO 

in 

00 

XT 


4859 


4860 


£ 

CO 


4862 


4863 


4864 


4865 


1 

4866 


SEQ 
NO. 
(DNA) 


1350 


1351 
1352 


m 

CO 


lO 
fO 


in 
in 

CO 


CD 

m 

CO 


in 
ro 


CD 

in 

CO 


! m 

CO 


o 

CO 
CO 


CO 
ro 


CNi 

CD 
CO 


CO 
CD 
CO 


CD 
CO 


in 

CD 
CO 


1365 
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C 
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Function 


transcriptional regulator 


acetyltransferase 








tRNA (5-methylaminomethyl-2- 
thiouridylate)-methyltransferase 




hypothetical protein 


tetracenomycin C resistance and 
export protein




DNA ligase 

(polydeoxyribonucleotide synthase 


hypothetical protein


glutamyl-tRNA(Gln)
amidotransferase subunit C


glutamyl-tRNA(Gln)
amidotransferase subunit A 


vibriobactin utilization protein / iron- 
chelator utilization protein 


hypothetical membrane protein 


pyrophosphate-fructose 6-
phosphate 1-phosphotransferase


Matched 
length 
(aa) 


O) 










CD 
CO 




CN 
CO 
CO 


o 

O 
lO 




(0 


o 

CN 
CN 


cn 


00 


CO 
CO 
CN 


CD 


oo 
m 

CO 


Similarity 
(%) 


76.3 


55.3 








80.9 




66.0 


65.8 




70.6 


70.9 


o 

CO 


83.0 


54.0 


79.2 


77.9 


Identity 
(%) 


m 


GO 








00 
(0 




33.7 


30.2 




42.8 


40.0 


53.0 


o 


28.1 


46.9 


54.8 


Homologous gene 


Rhizobium sp. NGR234 plasmid 
pNGR234a Y4mF 


Escherichia coli K12 MG1655 
yhbS 








Mycobacterium tuberculosis 
H37Rv Rv3024c 




Mycobacterium tuberculosis 
H37Rv Rv3015c


Streptomyces glaucescens tcmA 




Rhodothermus marinus dnlJ 
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glucose-resistance amylase 
regulator (catabolite control protein) 


ribose transport ATP-binding protein


high affinity ribose transport protein


periplasmic ribose-binding protein 


high affinity ribose transport protein 


hypothetical protein 


iron-siderophore binding lipoprotein 


Na-dependent bile acid transporter 


RNA-dependent amidotransferase D 


putative F420-dependent NADH 
reductase 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein 




dihydroxy-acid dehydratase 


hypothetical protein 
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MJ1501 f4re 


Escherichia coli K12 yqjG 
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hypothetical membrane protein 


hypothetical protein 




nitrate transport ATP-binding protein


maltose/maltodextrin transport ATP-
binding protein 


nitrate transporter protein 






actinorhodin polyketide dimerase 


cobalt-zinc-cadmium resistance
protein 






hypothetical protein 




D-3-phosphoglycerate 
dehydrogenase 


hypothetical serine-rich protein






hypothetical protein 
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Corynebacterium glutamicum 
ATCC 13032 yilV 


Sulfolobus solfataricus 




Synechococcus sp. nrtD 


Enterobacter aerogenes 
(Aerobacter aerogenes) malK 


Anabaena sp. strain PCC 7120 
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Ralstonia eutropha czcD 
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Schizosaccharomyces pombe 
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Rhodobacter capsulatus strain 
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lipoprotein 


1 


glycogen phosphorylase






hypothetical protein 


hypothetical membrane protein




CO 

jr 

Q. 
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Q. 
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TO 2 

O) a 


acetate repressor protein


3-isopropylmalate dehydratase large
subunit 


3-isopropylmalate dehydratase small 
subunit 




mutator mutT protein (7,8-dihydro-
8-oxoguanine-triphosphatase)(8- 
oxo-dGTPase)(dGTP 
pyrophosphohydrolase) 




NAD(P)H-dependent
dihydroxyacetone phosphate


D-alanine-D-alanine ligase 
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Homologous gene 






Chlamydia trachomatis 




Rattus norvegicus (Rat) 






Bacillus subtilis yrkH 


Methanococcus jannaschii Y441 




Escherichia coli K12 spoT 


Escherichia coli K12 iclR


Actinoplanes teichomyceticus
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Salmonella typhimurium 




Mycobacterium tuberculosis 
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Bacillus subtilis gpdA 


Escherichia coli K12 MG1655 
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thiamin-phosphate kinase | 


uracil-DNA glycosylase precursor


hypothetical protein 


ATP-dependent DNA helicase 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 


biotin carboxyl carrier protein 


methylase 


lipopolysaccharide core biosynthesis
protein 




Neisserial polypeptides predicted to
be useful antigens for vaccines and 
diagnostics 


ABC transporter or glutamine ABC 
transporter, ATP-binding protein


nopaline transport protein 


glutamine-binding protein precursor 




hypothetical membrane protein 




phage integrase 
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hypothetical protein 
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cephamycin export protein 


DNA-binding protein


0) 
iO 
{0 

c 
<v 
Oi 

o 

■o 

>s 
JZ 

Q) 
TJ 

CO 

d) 
c 

g. 

O 

E 




Matched 
length 
(a.a) 












CD 
CN 




N- 
CO 




















to 

O) 

oo 


(0 
m 
«r 


cn 

CO 
CN 


*t 

00 
CN 




Similarity 
(%) 












96.2 




97.0 




















60.8 


67.8 


CO 


76.1 




Identity 
(%) 












88.5 




89.0 




















56.3 


33.8 


41.3 


46.5




Homologous gene 
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Function 


hypothetical protein 










hemolysin 


hemolysin 




DEAD box RNA helicase 


ABC transporter ATP-binding protein 


6-phosphogluconate dehydrogenase


thioesterase




nodulation ATP-binding protein I


hypothetical membrane protein 


transcriptional regulator 


phosphonates transport system 
permease protein 


phosphonates transport system 
permease protein 


phosphonates transport ATP-binding 
protein 








Matched 
length 

(a.a.)


00 










CN 
CO 


in 

CO 




CO 


in 

V 
CN 


CN* 

cn 
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ro 

CN 
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CO 
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CN 
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to 

CN 
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m 
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CO 
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CO 
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39.6 


43.1 
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29.9 
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44.8 
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Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv1828










Bacillus subtilis yhdP 


Bacillus subtilis yhdT 




Thermus thermophilus herA 


Mycobacterium tuberculosis 
H37Rv Rv1348


Brevibacterium flavum


Mycobacterium tuberculosis
H37Rv Rv1847




Rhizobium sp. N33 nodI


Mycobacterium tuberculosis
H37Rv Rv1686c


Escherichia coli K12 yfhH 


Escherichia coli K12 phnE 


Escherichia coli K12 phnE 


Escherichia coli K12 phnC 
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Function 




phosphomethylpyrimidine kinase 


hydroxyethylthiazole kinase


cyclopropane-fatty-acyl-phospholipid 
synthase 


sugar transporter or 4-methyl-o- 
phthalate/phthalate permease 


purine phosphoribosyltransferase 


hypothetical protein 


ex. 
E 

CL 

C 

o 

c5 
o 
o 
v> 
c 

OJ 

li 

C 3 
(0 (/) 

" s 

c -g 
o c 
</) <1» 

CO E 




hypothetical protein 


sulfate permease 


hypothetical protein 










hypothetical protein 


dolichol phosphate mannose 
synthase 


apolipoprotein N-acyltransferase 




secretory lipase
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length 
(aa) 
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CN 
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CN 
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milarity 
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55.0 
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83.8 


83.6 
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87.3 


71.0 


55.6 




55.6 
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46.6 


28.6 
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36.6 


39.8 


23.3 




62.2 


51.8 


39.0 










71.8 


39.2 


25.1 




23.7 


Homologous gene 




Salmonella typhimurium thiD 


Salmonella typhimurium LT2 
thiM 


Mycobacterium tuberculosis 
H37Rv ufaA1


Burkholderia cepacia Pc701 
mopB 


Thermus flavus AT-62 gpt 


Escherichia coli K12 yebN


Sinorhizobium sp. A54 arsB 




Streptomyces coelicolor A3(2) 
SCI7.33 


Pseudomonas sp. R9 ORFA 


Pseudomonas sp. R9 ORFG 










Mycobacterium tuberculosis 
H37Rv Rv2050


Schizosaccharomyces pombe
dpm1


Escherichia coli K12 lnt




Candida albicans lip1
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sp:TH!D_SALTY 
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1550398 
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1557014 


1557859 


1559497 
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1541403 
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1544976 


1547692 


1548440 


1548651 


1549403 
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1551545 
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1554684 


1554861 


1556079 


1555835 


1556376 
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Function 


precorrin-2 methyltransferase


precorrin-6Y C5,15-
methyltransferase 






oxidoreductase 


dipeptidase or X-Pro dipeptidase




ATP-dependent RNA helicase 


sec-independent protein translocase 
protein 


hypothetical protein 


hypothetical protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical protein 


hypothetical protein
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length 
(aa) 


CN 
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CO 




1030 


CO 
(O 
CN 
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80.3 
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36.1 
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28.7 




31.9 


32.4 


53.1 
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in 


48.6 


q 

CN 


Co Co ro 

Table 1 (continued) 


Homologous gene


Mycobacterium tuberculosis
H37Rv cobG


Pseudomonas denitrificans
SC510 cobL






Mycobacterium tuberculosis 
H37Rv Rv3412


Streptococcus mutans LT11
pepQ 




Saccharomyces cerevisiae 
YJL050W dob1 


Escherichia coli K12 tatC


Mycobacterium leprae
MLCB2533.27


Mycobacterium tuberculosis
H37Rv Rv2095c


Mycobacterium leprae
MLCB2533.25


Mycobacterium tuberculosis
H37Rv Rv2097c




Mycobacterium tuberculosis
H37Rv Rv2111c


Mycobacterium tuberculosis
H37Rv Rv2112c


Aeropyrum pernix K1 APE2014 
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1574945 


1575406 
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Function 


AAA family ATPase (chaperone-like 
function) 


protein-beta-aspartate 
methyltransferase 


aspartyl aminopeptidase 


hypothetical protein 


virulence-associated protein 


quinolone resistance protein


aspartate ammonia-lyase


ATP phosphoribosyltransferase


beta-phosphoglucomutase 


5-methyltetrahydrofolate- 
homocysteine methyltransferase 




alkyl hydroperoxide reductase
subunit F 


arsenical-resistance protein 


arsenate reductase 


arsenate reductase 




cysteinyl-tRNA synthetase


Matched 
length 
(aa) 


in 
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CN 


to 
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to 
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milarity 
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CN 
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to 

CN 
CO 


CN 




35.9


Homologous gene 


Rhodococcus erythropolis arc 


Mycobacterium leprae pimT 


Homo sapiens 


Mycobacterium tuberculosis 
H37Rv Rv2119


Dichelobacter nodosus A198
vapI


Staphylococcus aureus norA23


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233 
aspA 


Corynebacterium glutamicum 
AS019 hisG


Thermotoga maritima MSB8
TM1254


Escherichia coli K12 metH




Xanthomonas campestris ahpF


Saccharomyces cerevisiae
S288C YPR201W acr3


Staphylococcus aureus plasmid
pI258 arsC


Mycobacterium tuberculosis
H37Rv arsC




Escherichia coli K12 cysS
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1591941 
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Function 


bacitracin resistance protein 


oxidoreductase 


lipoprotein 


dihydroorotate dehydrogenase






transposase 




bio operon ORF 1 (biotin biosynthetic 
enzyme) 


Neisserial polypeptides predicted to
be useful antigens for vaccines and 
diagnostics 




ABC transporter 




ABC transporter 




in 

2 
cu 
m 
c 

2 
% 

Z 

c 

>» 

E 
o 

=] 

Q. 


LAO (lysine, arginine, and
ornithine)/AO (arginine and
ornithine) transport system kinase


methylmalonyl-CoA mutase alpha 
subunit 
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length 
(a.a) 
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ro 

fO 
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Similarity 
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69.4 


62.6 


53.5 


67.1 






55.3 




75.0 


33.0 




68.7 




67.1 
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72.3 
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Identity 
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44.1 


26.0 




43.6 
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CO 
CO 




32.4 


43.1 


72.2 




Table 1 (continued)


Homologous gene 


Escherichia coli K12 bacA 


Agrobacterium tumefaciens
mocA


Mycobacterium tuberculosis
H37Rv lppL


Agrocybe aegerita ura1 






Pseudomonas syringae tnpA 




Escherichia coli K12 ybhB 


Neisseria meningitidis 




Corynebacterium striatum M82B 
tetB 




Corynebacterium striatum M82B 
tetA 




Streptomyces anulatus pac


Escherichia coli K12 argK


Streptomyces cinnamonensis
A3823.5 mutB






db Match 


sp:BACA_ECOLI


prf:2214302F


pir:F70577


sp:PYRD_AGRAE
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1611150 


1612234 
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1598623 
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1599679 


1600692 
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1605315 


1605811 
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1607645 


1607657 
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1609247 
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! 1610236 


1612238 


CO 
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00 
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Function 


methylmalonyl-CoA mutase beta 
subunit 


hypothetical membrane protein




hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 




ferrochelatase 


invasin




aconitate hydratase 


transcriptional regulator 


GMP synthetase 


hypothetical protein 


hypothetical protein 




hypothetical protein 


Matched 
length 
(aa) 


o 
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(O 


CN 
CM 




o 

CO 


r— 


CO 
CN 




CO 

cn 


CO 




m 




ID 
CO 
CN 


M 


CO 
CO 




CO 


Similarity


68.2 


70.1 




87.0 


78.7 


72.8 




65.7 


56.5 




85.9


81.6 


51.9 


62.0 


80.2 




86.1 






































Identity 
(%) 




39.7 




64.1 


44.7 


51.0 




36.8 


25.5 




66.9


54.6 


21.3 


32.6 


37.2 




61.2 


Homologous gene 


Streptomyces cinnamonensis 
A3823.5 mutA 


Mycobacterium tuberculosis 
H37Rv Rv1491c




Mycobacterium tuberculosis
H37Rv Rv1488


Mycobacterium tuberculosis
H37Rv Rv1487


Streptomyces coelicolor A3(2) 
SCC77.24 




Propionibacterium freudenreichii
subsp. shermanii hemH


Streptococcus faecium 




Mycobacterium tuberculosis 
H37Rv acn


Mycobacterium tuberculosis
H37Rv Rv1474c


Methanococcus jannaschii 
MJ1575 guaA 


Streptomyces coelicolor A3(2) 
SCD82.04C 


Methanococcus jannaschii 
MJ1558 




Neisseria meningitidis MC58 
NMB1652 
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ro 
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ID 
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1619672 


1620167 
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1621841 


1623027 


1625428 


1629107 


1629861 


1630668 




1630667 


1631926 


1631353 


1633324 


Initial 
(nt) 


1616298 


1616578 


1617398 


1619616 


1620106 




1621009 


1621056 


1622950 


1624826 


1625925 


1626279 
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Function 

1 


antigenic protein 


antigenic protein 


cation-transporting ATPase P 




hypothetical protein 










host cell surface-exposed lipoprotein 


integrase


ABC transporter ATP-binding protein 




sialidase 


transposase (IS1628) 


transposase protein fragment 


hypothetical protein 




dTDP-4-keto-L-rhamnose reductase 


nitrogen fixation protein 
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length 
(aa) 
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CN 
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CO 
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CN 


cn 


CO 
CO 




o 


CJ) 


Similarity 
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69.0 
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72.4 
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72.0 
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Identity 
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in 


59.0 
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35.8 










43.0 


34.4


32.8 




51.9 


99.6 


64.0 


32.0 




32.7 


63.8 


Homologous gene 


Neisseria gonorrhoeae ORF24 


Neisseria gonorrhoeae 


Synechocystis sp. PCC6803 
sll1614 pma1




Streptomyces coelicolor A3(2)
SC3D11.02c










Streptococcus thermophilus 
phage TP-J34 


Corynephage 304L int 


Escherichia coli K12 yjjK 




Micromonospora viridifaciens 
ATCC 31146 nedA


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB 


Corynebacterium glutamicum 
TnpNC 


Plasmid NTP16 




Pyrococcus abyssi Orsay 
PAB1087 


Mycobacterium leprae 
MLCL536.24C niRJ? 
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1642743 
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1637081 
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1639365 


1639656 


1639781 


1640546 


1642674


1644218 


1645499 


1645661 


1645821 


1645861


1646549 


XT 
CO 
CD 

xr 

to 


r-. 
cn 
o 
oo 

CO 
f— 


c/) 2 e 


5206 


5207


5208 


5209 


5210 


5211 


5212 


5213 


5214 


5215 


5216 


5217 


5218


5219 


5220 


5221 


5222 


5223 


- 

5224 


5225 




S 


Q 


1708 


o> 
o 


o 




CN 


CO 

1 


^ 


in 


1716 




CO 


CJ) 


o 

CN 


r- 

CN 


1722 


CO 
CN 


1724 


1725 





Function 


hypothetical protein


nitrogen fixation protein


ABC transporter ATP-binding protein 


hypothetical protein 


1 

ABC transporter 


DNA-binding protein 


hypothetical membrane protein 


ABC transporter 


hypothetical protein 


hypothetical protein 




helicase 


quinone oxidoreductase 


cytochrome o ubiquinol oxidase
assembly factor / heme O
synthase


transketolase 


0) 
CO 
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•o 

CO 
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c 

CO 
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ro 
o> 


(N 


CO 


fO 


CD 
CO 
CN 


CN 


— r 


CO 


CO 
CN 
ro 


to 
cn 

CN 


in 

CD 


00 

in 

CO 






Similarity 
(%) 
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84.4 


89.3 


83.0 

- 


73.0 


71.4 


67.8 


77.3 


74.8 


74.6 




51.0


70.9 


00 
CD 
CO 


100.0


85.2 


- 






48.0 


64.7 


70.2 


55.2 


41.0 


46.1 


36.3 


50.2 


41.0 


43.0




23.4 


37.5 


37.6 


100.0 


62.0 




25 

C 

c 
o 
u 

0} 

35 


Homologous gene 


Aeropyrum pernix K1 APE2025


Mycobacterium leprae nifS


Streptomyces coelicolor A3(2) 
SCC22.04c


Mycobacterium tuberculosis
H37Rv Rv1462


Synechocystis sp. PCC6803
slr0074


Streptomyces coelicolor A3(2)
SCC22.08c


Mycobacterium tuberculosis
H37Rv Rv1459c


Mycobacterium leprae
MLCL536.31 abc2


Mycobacterium leprae
MLCL536.32


Mycobacterium tuberculosis
H37Rv Rv1456c




Pyrococcus horikoshii PH0450 


Escherichia coli K12 qor 


Nitrobacter winogradskyi coxC 


Corynebacterium glutamicum 
ATCC 31833 tkt 


Mycobacterium leprae
MLCL536.39 tal 




db Match 
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'\ 
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AK 
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1648709 


1648100 


1649367 


1650249 


1651433 


1652894 


1655671 


1656700


1657515 


1658675 

1 


1659140 


1661136 


1662552 


1662630 


1666502 


1667752 


IQ9999L 




Initial 
(nt) 


1648548 


1649362 


1650122 


1651424 


1652875 


1653586 


1654043 


1 
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1656712 


1657677


1659496 


1659508 


1661578 


1663598 


1664403 


1666673 


CO 
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Function 


glucose-6-phosphate 
dehydrogenase 


oxppcycle protein (glucose 6- 
phosphate dehydrogenase 
assembly protein) 


6-phosphogluconolactonase 


sarcosine oxidase 


transposase (IS1676)


sarcosine oxidase 








triose-phosphate isomerase 


probable membrane protein 


phosphoglycerate kinase


glyceraldehyde-3-phosphate 
dehydrogenase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


excinuclease ABC subunit C 




Matched 
length 
(aa) 


00 

^ 


CO 
CO 


00 

in 

CM 


00 
CM 


o 
o 
in 
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o 
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O) 

in 
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CO 
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0 
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n 
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CO 


C7) 


00 
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Similarity 
(%) 


100.0 


71.7 


58.1 


57.8 


46.6 


100.0 

1 








(O 

ai 


51.0 


98.5


99.7 


87.4


82.6 


76.2 


61.5 


Identity 
(%)


99.8 


40.6 


28.7 


35.2 


24.6 


100.0 








rs 
ai 

Oi 


37.0 


98.0 


99.1 


63.9 


56.3 


52.0 


34.4 


Homologous gene 


Brevibacterium flavum 


Mycobacterium tuberculosis 
H37Rv Rv1446c opcA


Saccharomyces cerevisiae 
S288C YHR163W sol3


Bacillus sp. NS-129 


Rhodococcus erythropolis


Corynebacterium glutamicum 
ATCC 13032 soxA 








Corynebacterium glutamicum 
AS019 ATCC 13059 tpiA 


Saccharomyces cerevisiae 
YCR013C 


Corynebacterium glutamicum 
AS019 ATCC 13059 pgk 


Corynebacterium glutamicum 
AS019 ATCC 13059 gap 


Mycobacterium tuberculosis 
H37Rv Rv1423 


Mycobacterium tuberculosis 
H37Rv Rv1422


Mycobacterium tuberculosis 
H37Rv Rv1421


Synechocystis sp. PCC6803 
uvrC 


db Match 


CM 
(D 

CL 
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o 
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z 
w 
o 
< 

s 

{/) 

CL 

CO 


^1 

00 
CM 
CO 
CM 

< 
CL 
O) 


^1 

CM 
C^ 

Q 
O 
_l 

O 

o 

o. 

O) 








sp:TPIS_CORGL


< 
LU 
> 

U 
>- 

oi 
w 


_j 
0 
cn 
0 
(J 

0 

Q. 

0. 

CO 


sp:G3P_CORGL 


CO 

0 

O) 

Q 
'a. 


3 
> 
0' 
> 

CL 
W1 


sp:YR39_MYCTU 


-J 

LJL 
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00 
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cn 
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0 


1215 
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CO 
CD 


1023 


CNI 

on 


2088 




Terminal 
(nt) 


1669401 


1670375 


1671099 


1671273 


1673123 


1673266 


1677384 


1678070 


1680128


1680332 


1681670 


1681190


1682624 


00 

(O 


1685110 


1686152 


1687103 




Initial
(nt) 


1657950 


cn 

XT 

o> 


1670395 


1671677 


1671723 


1674105 


1677211 


1678756 


1679148 


1681108


1681263 


1682404 
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1685097 


1686132 


1687078 


1689190 
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m 

CM 
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15249 
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hypothetical protein 


0) 

c 

N 
(0 

E 

">» 

•9 

00 

E S 

CO V) 


polypeptide encoded by rib operon


riboflavin biosynthetic protein 


polypeptide encoded by rib operon


GTP cyclohydrolase II and 3,4-
dihydroxy-2-butanone 4-phosphate
synthase (riboflavin synthesis)


riboflavin synthase alpha chain 


riboflavin-specific deaminase 


ribulose-phosphate 3-epimerase


nucleolar protein NOL1/NOP2 
(eukaryotes) family 


methionyl-tRNA formyltransferase 


polypeptide deformylase


primosomal protein n


S-adenosylmethionine synthetase


DNA/pantothenate metabolism 
flavoprotein 


hypothetical protein 


guanylate kinase


integration host factor 


Matched 
length 
(aa) 
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in 


CN 


CN 


s 
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CN 


lO 


CO 
CN 


00 
TT 
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o 

CO 


o 
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in 

CM 
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CO 


CO 
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Similarity 
(%) 


68.7 


72.1 


68.0 


O 

t6 
■V 


52.0 


84.7 


79.2 


h- 

CN 
<D 


73.1 


60.7 


67.9 
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46.3 


99.5 


80.9 


87.7 


74.7 
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Identity 
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43.5 


59.0


o 

ID 
CN 


44.0


65.6 
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37.3 


43.6 
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CD 


r^ 


O) 

CM 
CN 


99.3 


58.0 


70.4 


39.8 


80.6 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv1417


Escherichia coli K12 


Bacillus subtilis


Bacillus subtilis


Bacillus subtilis


Mycobacterium tuberculosis ribA 


Actinobacillus
pleuropneumoniae ISU-178 ribE


Escherichia coli K12 ribD 


Saccharomyces cerevisiae 
S288C YJL121C rpe1


Escherichia coli K12 sun 


Pseudomonas aeruginosa fmt


Bacillus subtilis 168 def


Escherichia coli priA 


Brevibacterium flavum MJ-233 


Mycobacterium tuberculosis 
H37Rv Rv1391 dfp


Mycobacterium tuberculosis 
H37Rv Rv1390


Saccharomyces cerevisiae guk1


Mycobacterium tuberculosis 
H37Rv Rv1388 mIHF




H 
O 
> 

CO 

> 

id 
tn 


S 

o 

UJ 
CO 

a: 

d 
V) 


GSP:Y83273


(N 

CN 
CO 
GO 

>- 

CL 

to 


GSP:Y83273 


gp:AF001929_1


_j 

CL 

O 

i 

(t 

a. 
irt 


sp:RIBD_ECOLI


< 

UJ 

^' 

Q. 

q: 

1 ^ 


_i 
O 

o 

UJ 

2' 

D 

(/) 
d 


Ui 

< 

UJ 
CO 
CL 
yj 

u. 

d 


sp:DEF_BACSU 


O 
o 

UJ 

tr 

Q- 

d 
in 


gsp:R80060


D 
1- 
O 
>- 

CL 
U- 
D 
d 
v> 


o 
> 

o' 
o> 
Q 
> 
a. 

V} 


pir:KIBYGU


pir:B70899


O ii- 


O) 
r-- 
in 




00 
CN 
CN 




CD 
CO 

<n 


1266 


CO 
CO 
(O 


00 
O) 


in 
CO 


1332 


m 

0) 


o 
in 


2064 


1221


1260 


o> 

CN 


CN 

to 


CO 

ro 


Terminal 
(nt) 


1689201 


1689869 


1690921 
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1702991 
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1690708 


1691012 


1691625 


1692271 
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1695298 
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Q> 
TO 

x: 

CL 
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P 

O T5 


carbamoyl-phosphate synthase
large chain


carbamoyl-phosphate synthase
small chain


<D 
in 

i2 
o 

o 

TJ 

x: 


aspartate carbamoyltransferase 


phosphoribosyl transferase or 
pyrimidine operon regulatory protein 


cell division inhibitor 








N utilization substance protein B
(regulation of rRNA biosynthesis by 
transcriptional antitermination) 


elongation factor P 


cytoplasmic peptidase 


Q> 
lA 
CQ 
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V) 
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cr 
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t 
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Ishfkrmate kinase 


type IV prepilin-like protein specific
leader peptidase 
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length 
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t~ 
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00 
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Similarity 
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77.5 


70.1 


67.7 


79.7 
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99.7 
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Table 1 (continued) 


Homologous gene 
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o a: 
s r 


Escherichia coli carB


Pseudomonas aeruginosa 
ATCC 15692 carA 


Bacillus caldolyticus DSM 405 
pyrC 


Pseudomonas aeruginosa 
ATCC 15692 


Bacillus caldolyticus DSM 405 
pyrR 


Mycobacterium tuberculosis 
H37Rv Rv2216








Bacillus subtilis nusB


B 
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C CO 
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! Corynebacterium glutamicum 
AS019 pepQ


Corynebacterium glutamicum 
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Corynebacterium glutamicum 
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length 
(aa) 
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o 
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o 

CO 


i 

5 

to 


Similarity 
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81,0 


68.2 


80.2 


98.6 


51.4 




80.8 


59.1 


85.5 


61.2 
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99.6 


64.0 


99.1 




79.0 


50.7 


Identity 
(%) 


58.0 


38.4


54.4 


98.0 


23.9




61.3 


32.3


65.8


33.5 


97.2 


98.7 


62.0 


99.1 




ro 
in 


CN 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2699c


Escherichia coli K12 suhB 


Mycobacterium tuberculosis 
H37Rv Rv2702 ppgK


Corynebacterium glutamicum 
sigA 


Bacillus subtilis yrkO 




Mycobacterium tuberculosis 
H37Rv Rv2917


Mycobacterium tuberculosis
H37Rv Rv2709


Mycobacterium tuberculosis
H37Rv Rv2708c


Streptomyces coelicolor A3(2)
SCH5.08c


Corynebacterium glutamicum
ATCC 13869 ORF1


Corynebacterium glutamicum 
ATCC 13869 dbcR 


Streptomyces aureofaciens 


Corynebacterium glutamicum 
ATCC 13869 (Brevibacterium 
lactofermentum) galE 




Mycobacterium tuberculosis 
H37Rv Rv2714


Saccharomyces cerevisiae 
YJL050W dob1 
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—1 
O 
o 
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q: 
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CN 
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CO 
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to 
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to 


n 

CN 
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CM 
CO 


00 
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to 

CN 
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Q) 


1323 
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cn 


2550 
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2013356 i 


CM 
(D 

NT 

O 
CN 
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2017966' 2016257 


2018754 
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2020276 


2020724 


2022949 


2022313 


2023945 


2023948 


2026379 


2029043 


Initial 
(nt) 


2009570 


2010539 


2010555 


2011863 


2015496


2016121 


2018119 


2018202 


00 

o 

CN 


2020293 


2022266 


2022546 


2022959 


2025270


2025423 


2026494 
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NO. 
(a.a.) 


5597 


5598
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5603 
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9099 
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5609 
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5613 


SEQ 
NO. 
(DNA) 
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1 2101 
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1 2106 
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2110 
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Function 


hydrogen peroxide-inducible genes
activator




ATP-dependent helicase


regulatory protein 




SOS regulatory protein


galactitol utilization operon repressor


phosphofructokinase (fructose 1- 
phosphate kinase) 


phosphoenolpyruvate-protein 
phosphotransferase 


glycerol-3-phosphate regulon
repressor 


1-phosphofructokinase or 6-
phosphofructokinase


PTS system, fructose-specific IIBC 
component 


phosphocarrier protein 




uracil permease


ATP/GTP-binding protein






diaminopimelate epimerase
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length 
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CO 
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Homologous gene 


Escherichia coli oxyR




Escherichia coli hrpA


Streptomyces clavuligerus nrdR 




Bacillus subtilis dinR 


Escherichia coli K12 gatR 


Streptomyces coelicolor A3(2) 
SCE22.14c


Bacillus stearothermophilus ptsI


Escherichia coli K12 glpR


Rhodobacter capsulatus fruK 


Escherichia coli K12 fruA


Bacillus stearothermophilus XL- 
65-6 ptsH 




Bacillus caldolyticus pyrP


Streptomyces fradiae orf11*






Haemophilus influenzae Rd 
KW20 HI0750 dapF 
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5 
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Function 


hypothetical protein 


hypothetical protein (35kD protein) 


regulator (DNA-binding protein) 


competence damage induced 
proteins 


phosphatidylglycerophosphate
synthase 


hypothetical protein 


surface protein (Pneumococcal
surface protein A) 




tellurite resistance protein 


stage III sporulation protein E 


hypothetical protein 


hypothetical protein 


hypothetical protein 






guanosine pentaphosphate 
synthetase 


30S ribosomal protein S15 


nucleoside hydrolase




Matched 
length 
(aa) 


00 
CN 
CN 


a> 

CO 
CN 


CO 
CO 


in 

CO 


O 
CO 

T- 




o 

CO 




CO 

m 

CO 


in 

CO 


CO 

T- 


m 

s 


o 
m 

CN 






CN • 


O) 
00 


co 




Similarity 
(%) 


78.5 


89.6 


78.3 


68.5 


72.5 


52.1 


70.0 




59.8 


64.6


61.0 


99.4 


99.6 






85.3 


88.8 


63.3 




Identity 
(%) 


41.7 


72.5 


54.2 


41.8 


38.8 


24.8 


60.0 




31.0 


38.0 


33.3 


99.1 


99.2 






65.4 


64.0


T- 

iri 

CO 


Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 


Mycobacterium tuberculosis 
H37Rv Rv2744c


Mycobacterium tuberculosis 
H37Rv Rv2745c


Streptococcus pneumoniae R6X 
cinA 


Streptococcus pyogenes pgsA 


Arabidopsis thaliana
ATSP:T16I18.20 


Streptococcus pneumoniae 
DBL5 pspA 




Escherichia coli terC 


Bacillus subtilis 168 spoIIIE


Streptomyces coelicolor A3(2) 
SC4G6.14 


Corynebacterium glutamicum 
ATCC 13032 orf4 


Corynebacterium glutamicum 
(Brevibacterium lactofermentum)
ATCC 13869 orf2 






Streptomyces antibioticus gpsI 


Bacillus subtilis rpsO 


Leishmania major 


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cn 

(0 
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cn 

CO 


s 

CN 
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to 

CM 


00 

a 
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Terminal 
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2068556 


2069616 


2069997 


2070519 


2071599


2071740
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2073294 


2076392
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1 

2080387 


2082813 


2082105 
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to 

CO 
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o 

CN 


2085879 


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(nt) 
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2071624 


2072066
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o 

CN 


2079275 


2061136 


2082115
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CO 2 e 
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CO 
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CO 

in 
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5656 


5657 


5658 


5659 


5660 


5661 


5662 


5663 


5664 


5665


5666


5667 




SEQ 
NO. 
(DNA) 
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U,s, 


2152 


2153 


2154 


in 
\n 

fN 
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2157 


2158 


2159 


2160 


2161 


2162 


2163 


2164 


2165 


2166


2167 
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Function 


bifunctional protein (riboflavin kinase 
and FAD synthetase) 


tRNA pseudouridine synthase B 


hypothetical protein 


hypothetical protein 


phosphoesterase 


DNA damage-inducible protein F


hypothetical protein 


ribosome-binding factor A 


translation initiation factor IF-2


hypothetical protein 


n-utilization substance protein 
(transcriptional 

termination/antitermination factor) 




hypothetical protein 


peptide-binding protein 


peptide transport system permease


oligopeptide permease 


peptide transport system ABC-
transporter ATP-binding protein






Matched 
length 

(a.a) 
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CO 
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CO 
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CO 
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CO 
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CO 


GO 

o 


1103 


CO 
00 


(N 

in 

CO 




in 

CD 
r— 


CO 

m 
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CO 


CN 

cn 

CN 
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in 
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Similarity 
(%) 


79.0 


61.7 


73.0 


62.5 


68.9 


78.8 


70.8 


70.4 


62.9


66.3


71.0 




65.5 


60.9 


ai 

CD 


69.2 


81.3 






Identity 
(%) 


66.2 


32.7 


65.0 


42.2 


46.9 


51.0 


36.7 


32.4 


37.7 


44.6 


CO 

r>i 




34.6 


25.3 


37.7 


38.4


57.6 




Table 1 (continued) 


Homologous gene 


Corynebacterium 
ammoniagenes ATCC 6872 ribF 


Bacillus subtilis 168 truB | 


Corynebacterium 
ammoniagenes 


Streptomyces coelicolor A3(2) 
SC5A7.23 


Mycobacterium tuberculosis
H37Rv Rv2795c


Mycobacterium tuberculosis 
H37Rv Rv2836c dinF


Mycobacterium tuberculosis 
H37Rv Rv2837c


Bacillus subtilis 168 rbfA 


Stigmatella aurantiaca DW4 infB 


Streptomyces coelicolor A3(2) 
SC5H4.29 


Bacillus subtilis 168 nusA 




Mycobacterium tuberculosis 
H37Rv Rv2842c


Bacillus subtilis 168 dppE 


Escherichia coli K12 dppB


Bacillus subtilis spo0KC


Mycobacterium tuberculosis 
H37Rv Rv3663c dppD


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2088863 i 


IT 
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00 
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1 

2089218


2089861 


2090751 


2092051 


2093055 


2093712 


2096844 


2097380 


2099815 


2098412 


2101841 


2102946


2103973 


2105703 






Initial 
(nt) 


2087941 


2087973 


2088181 


2089868 


2090664 


2092055 


2093046 


2093501


2096723 


2097179 


2098375 


2098562


2098945 


2100240 


2102023 


2102975 


2103973 






SEQ 
NO. 
(a.a.) 


5668 


5669 
5670 


5671 


5672 


5673 


5674 


5675 


5676 


5677 


5678 


5679 


5680 
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5682 


5683 
5684 
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SEQ 
NO. 
(DNA) 
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2169 
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2171 


2172 


2173 


2174 
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2176 


2177 


2178 


2179 


2180 


2181


2182 


2183 


2184 
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Function 


prolyl-tRNA synthetase 


hypothetical protein 


magnesium-chelatase subunit


magnesium-chelatase subunit


uroporphyrinogen III
methyltransferase


hypothetical protein 


hypothetical protein 


hypothetical protein 


glutathione reductase 










methionine aminopeptidase 


penicillin binding protein 


response regulator (two-component 
system response regulator) 


two-component system sensor 
histidine kinase 


hypothetical membrane protein 




Matched 
length 
(a.a) 
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Similarity 
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69.6 


73.8 


68.7 


62.3 
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56.5 


OJ 
CN 
N. 


56.8 


58.1 




Identity 
(%) 


67.0 


39.5 


32.4 


46.5 


49.0 


41.2 


35.1 


37.6 


53.0 










47.2 


27.3 


44.0 


29.5 


24.4




Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2845c proS


Streptomyces coelicolor A3(2) 
SCC30.05 


Rhodobacter sphaeroides ATCC 
17023 bchD 


Heliobacillus mobilis bchI 


Propionibacterium freudenreichii
cobA 


Clostridium perfringens NCIB 
10662 ORF2 
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Mycobacterium tuberculosis 
H37Rv Rv2854


Burkholderia cepacia AC1100
gor 










Escherichia coli K12 map 


Streptomyces clavuligerus pcbR 


Corynebacterium diphtheriae 
chrA 


Corynebacterium diphtheriae 
chrS


Deinococcus radiodurans 
DRA0279 
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5591 


CN 

o> 

<D 

in 


5693 


5694 


5695 


5696


5697


5698


5699


5700 


5701 


5702


SEQ 
NO. 
(DNA) 


2185 


2186 


2187 


2188 


2189 


2190 


2191 


2192 


2193 


2194 


2195


2196 


2197 


2198 


2199 


2200 


2201 


s i 

CN 1 
CN j 
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c 

8 



0) 



Function 


ABC transporter 




hypothetical protein (gcpE protein) 




hypothetical membrane protein 


polypeptides can be used as 
vaccines against Chlamydia 
trachomatis 


1-deoxy-D-xylulose-5-phosphate 
reductoisomerase 








ABC transporter ATP-binding protein 


pyruvate formate-lyase 1 activating 
enzyme 


hypothetical membrane protein 


phosphatidate cytidylyltransferase 


ribosome recycling factor 


uridylate kinase 




elongation factor Ts 


30S ribosomal protein S2


Matched 
length
(aa) 


to 

OJ 
CNI 




CD 

tn 
n 




in 
0 
tr 




CM 
1— 

CO 








in 


CD 

in 

CO 




(N 


vn 
00 

T- 


0 




0 

CO 
CM 


in 

CN 


Similarity 
(%) 


71.1 




73.8 




73.6 


43.0 


42.0








75.1


78.0


74.5 


56.5 


84.3 


43.1




76.8 


83.5 


Identity 
(%) 


37.3 




44.3 




43.0 


36.0 


22.8 








37.1 


66.0 


41.5 


33.3 


47.0 


28.4 




49.6 


to 


Homologous gene 


Bacillus subtilis 168 yvrO 




Escherichia coli K12 gcpE 




Mycobacterium tuberculosis 
H37Rv Rv2869c


Chlamydia trachomatis 


Escherichia coli K12 dxr 








Thermotoga maritima MSB8 
TM0793 


Mycobacterium tuberculosis 
H37Rv


Mycobacterium tuberculosis 
H37Rv Rv3760


Pseudomonas aeruginosa 
ATCC 15692 cdsA 


Bacillus subtilis 168 frr 


Pseudomonas aeruginosa pyrH 




Streptomyces coelicolor A3(2) 
SC2E1.42 tsf 


Bacillus subtilis rpsB 


db Match 


Q. 
O 

rr 

o 

CN 
CN 

a. 


] 


sp:GCPE_ECOLI 




pir:G7C886 


GSP:Y37145 


sp:DXR_ECOLI 








pir:B72334 


3 
1- 
0 
>- 
5 
0 

CO 

(/) 
> 


pir:A70801


sp:CDSA_PSEAE 


sp:RRF_BACSU 


prf:2510355C




0 
0 
cr 

(0 

1 

CO 

u_ 

UJ 

icL 
in 


pir:A69699 


ORF 

(Dp) 


o 
cn 

CO 


!CN 

to 




CN 
CO 


CN 
CN 




in 


1176




0 

CO 


1578


U) 

in 

CO 


1098 

1 


00 
in 

CN 


to 
to 
00 


in 
in 
in 


C3> 
CN 


00 


in 

CN 
CO 


to 

GO 


Terminal 
(nt) 


2126753 


2126926 


2127350 


2129461 


2128669 


2130950 


2129903 


2131762 


2131247 


2131825 


2133406 


in 

CO 
i CN 

1 


2136141 


2136235 


2137286 


2137936 


2139854 


2139003 


2140071 


Initial
(nt) 


26064 


27087 


28483 


28850 


29880 




30306 


131078 


131322 


131726 


133402 


134260 


135551 


35884 


37089 


1378401 


38664 


a) 

CO 
CO 


39827 


40886 1 




CN 


CM 




ICN 


CN 




CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


1 CN 


SEQ 
NO. 
(a.a.) 


5703 


5704 


5705 


5706 


5707 


5708 


5709 


5710 


in 


5712 


5713 


1 ^ 

1 r*- 
1 ^ 


5715 


5716 


5717 


5718 


5719 


5720 


5721 


SEQ
NO.
(DNA)


2203


o 

CN 
CN 


2205


2206 


2207 


2208


2209 


2210 


2211 


2212 


2213 


! ^ 

; « 


2215 


2216 


2217 


2218 


2219 


2220 


2221 
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5 
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Function 


hypothetical protein 


site-specific recombinase 


hypothetical protein 


Mg(2+) chelatase family protein 


hypothetical protein 


hypothetical protein


ribonuclease HII 




signal peptidase 


Fe-regulated protein 




50S ribosomal protein L19


thiamine phosphate 
pyrophosphorylase 


oxidoreductase 


thiamine biosynthetic enzyme thiS
(thiG1) protein


thiamine biosynthetic enzyme thiG
protein 


molybdopterin biosynthesis protein




Matched 
length 
(aa) 


o 

CN 


cn 

CN 


in 

CO 


5 

m 




o 


o 

O) 




in 

GO 
CN 


CO 
CN 
CO 




T— 

X— 


m 

CN 
CN 


CO 
CO 


CN 
CO 


m 

CN 


CO 




>> 






































Similarity
(%) 


58.0 


68.7 


66.8 


75.8 


72.3 


96.0 


69.5 




61.1 


59.1




88.3 


60.9 


64.1 


74.2 


76.9 


56.8 




Identity
(%)


o 

CO 


d 

N- 


39.8 


(0 

cc 


CO 

d 


68.3 


42.6 




32.3 


25.4 




70.3 


28.4 


c> 
^* 

CO 


37.1 


48.2 


30.2 


25 

C 

c 

8 

30 ^ 

0) 

n 

35 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2891


Proteus mirabilis xerD 


Mycobacterium tuberculosis 
H37Rv Rv2896c


Mycobacterium tuberculosis 
H37Rv Rv2897c


Mycobacterium tuberculosis 
H37Rv Rv2898c


Mycobacterium tuberculosis 
H37Rv Rv2901c 


Haemophilus influenzae Rd 
HI1059 rnhB




Streptomyces lividans TK21
sipY 


Staphylococcus aureus sirA




Bacillus stearothermophilus rplS


Bacillus subtilis 168 thiE 


Streptomyces coelicolor A3(2)
SC6E10.01 


Escherichia coli K12 thiS


Escherichia coli K12 thiG


Emericella nidulans cnxF 




db Match 


sp:YS91_MYCTU 


prf:2417318A


D 
h- 
O 
> 

1 

CN 

1 ^ 

CL 
I/) 


D 
1- 
O 
> 

> 

CL 
V) 


sp:YX29_MYCTU 


\- 
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5 

o 

> 
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2: 

LU 

X 
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a: 
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X 
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QO 
CN 

in 

CN 
t 
ZL 
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c' 

T- 

—1 

cr 

CL 
(A 


sp:THIE_BACSU 


gp:SC6E10_1 


_i 
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0 

^. 
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X 

b. 


sp:THIG_ECOLI


CO 
CO 

NT 
CN 

Q. 




u 


o 
in 


CN 
O) 


1182 


1521 


CO 
CD 
CO 


CO 

o 

CO 
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CN 
CD 


CN 

cn 


CD 
00 


CD 
CO 
O) 


CO 
CN 


C7> 
CO 
CO 


CO 
CD 
CD 


1080 


m 


0 
00 


■«T 
CO 




Terminal 
(nt) 


2141760 


2141763 


2142885 


2144066 


2145576 


2146264 


2146566 


2148022 


2147261 


2149166 


2149359 


rr 

CO 
CD 

o> 

CN 


2150997 


2152118 


2152329 


2153113 


2154191 




Initial
(nt) 


41257 


42686 j 


CO 
CO 

o 


45586 


45941 


46566 


147192 


147231 1 


CD 
"q- 
O 
00 


48231 


149571 


149972 


150335


151039 


52135 


152334 


53058 






CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 1 CN 


CNJ 


CN 


i ^ 




CN 




, w ^ 2. 


5722 


5723 


CN 

in 


5725 


5726 


5727 


5728 


5729 


5730 


5731 


5732 


5733


CO 
in 


5735 


5736 


5737 


5738 




SEQ 
NO. 
(DNA) 


2222 


2223 


2224 


2225 


2226 


2227 


2228 


2229 


2230 


2231 


2232 


2233 


2234 


2235 


2236 


2237 


2238 



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15 






Function 


transcriptional accessory protein 


sporulation-specific degradation 
regulator protein 


dicarboxylase translocator


o 
to 
u 
o 
(/) 
c 

2 
B 
ro 

S 
ro 

o 

X 

o 

CM 


3-carboxy-cis,cis-muconate
cycloisomerase 








tRNA (guanine-N1)- 
methyltransferase 


hypothetical protein 


16S rRNA processing protein 


hypothetical protein


30S ribosomal protein S16 


inversin 


ABC transporter


ABC transporter 


signal recognition particle protein 








cell division protein 


Matched 
length 
(aa) 


CO 

r- 


CO 
CO 


to 
in 


in 

(O 


o 
in 

00 








ro 

CN 


o 

CM 


CN 


(D 


CO 

CO 


to 
- 


in 

CN 


oo 

CO 


Ol 

in 
m 








m 
o 
m 


Similarity 
(%) 


78.7 


65.3 


78.3 


o 

CD 

CO 


66.3 








64.8 


57.6 


72.1 


66.7 


79.5 


61.7 


69.1 


63.8


(N 
CO 








66.1 


Identity 
(%) 


56.6 


27.0


45.8 


40.0


39.1 








34.8 


30.5 


52.3 


29.0 


47.0 


32.1 


26.6


35.5


58.7 








37.0 


Homologous gene 


Bordetella pertussis TOHAMA 1 
tex 


Bacillus subtilis 168 degA 


Chlamydophila pneumoniae 
CWL029 ybhI


Spinacia oleracea chloroplast 


Pseudomonas putida pcaB 








Escherichia coli K12 trmD


Streptomyces coelicolor A3(2) 
SCF81.27 


Mycobacterium leprae 
MLCB250.34 rimM


Helicobacter pylori J99 jhp0839 


Bacillus subtilis 168 rpsP


Mus musculus inv 


Streptococcus agalactiae cylB


Pyrococcus horikoshii OT3 mtrA


Bacillus subtilis 168 ffh








Escherichia coli K12 ftsY


db Match 


LU 

a 

oc 
o 

CO 

x' 

UJ 

h- 

b. 
w 


pir:A36940


pir:H72105 


prf:2108268A 


sp:PCAB_PSEPU 
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O 

LU 

q' 

or 
1- 
d 


^' 

00 

li- 
U 

C/3 

d 

O) 


UJ 

_j 

u 
>- 

2' 

5. 

d 
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00 
CO 


s 

* 


in 

<*— 

J- 

U 

o. 


prf:2512328G 


O 
Oi 

to 
o 

CN 
CN 
CN 

t: 
a 


sp:SR54_BACSU 








_i 
O 

o 

UJ 

1 

> 

CO 
H 
U_ 

d 




M 

CN 
CM 


in 

cn 


GO 
CN 


o> 

CN 


1251 


<Ji 


to 
a> 
ro 


O 
O) 

to 


Oi 
00 


CO 
CO 


ro 
<n 


00 


CJ) 


to 
m 


(O 

QO 


to 

00 


1641 
633 




cn 

CO 
CO 


1530 


Terminal 
(nt) 


2154460 


2156747 


2157754 


2159019 


2159287 


2160768 


2161111 


o 
m 

CO 
CN 


2162196 


2163745 


2163748 


2164737


2164815 


2166098 


2166124


2166990 


2167944 
2171058 


2172131 


2172877 


2173759 


Initial 
(nt) 


2156733 


2157721 


2159181 


2159237


2160537 


2160670 


2161503 


2162196 


2163014 


2163098 


2164260 


2164390


2165309 


2165523 


2165990


2167865


2169584 
2170426 


2171715 


2172209 


2175288 




§ 
in 


5741 


5742 


5743 


5744 


5745 


5746 


5747 


5748 


Ol 

m 


5750 


5751 


5752 


5753 


5754 


5755 
5756 


5757 


5758 


5759 


SEQ 
NO. 

(DNA)


2239


o 

CN 
CN 


2241 


2242 


2243 


CN. 
CN 


in 

CN 
CN 


2246 


2247 


2248 


2249 


2250


m 

<N 
CN 


2252 


2253 


2254


2255 


2256 


2257 


2258 


2259 
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Function 






glucan 1,4-alpha-glucosidase or
glucoamylase S1/S2 precursor 




chromosome segregation protein 


acylphosphatase 




transcriptional regulator 


hypothetical membrane protein 






cation efflux system protein 


formamidopyrimidine-DNA
glycosylase 


ribonuclease III 


hypothetical protein 


hypothetical protein 


transport protein 


ABC transporter 


hypothetical protein 




Matched 
length 
(aa)






1144 




1206 


CM 
Q) 




in 
o 
ro 


in 

CN 






OD 
CO 

T- 


in 

00 
CN 


CN 
CM 


CD 


CO 
CO 
CN 


O) 

in 
in 


in 


CD 
00 
CO 




Similarity 
(%) 














































46.2 




72.6 


73.9 




60.0 


73.5 






76.6 


66.7 


76.5 


62.5 


76.9 


55.6 


58.6 


62.6 




Identity 
(%) 






22.4 




48.3 


51.1 




23.9 


39.3 






46.8 


36.1 


40.3 


35.8 


50.0 


28.3 


26.6 


35.3 




Homologous gene






Saccharomyces cerevisiae 
S288C YIR019C sta1




Mycobacterium tuberculosis 
H37Rv Rv2922c smc


Mycobacterium tuberculosis
H37Rv Rv2922.1c




Escherichia coli K12 yfeR


Mycobacterium leprae 
MLCL581.28c






Dichelobacter nodosus gep


Escherichia coli K12 mutM or 
fpg 


Bacillus subtilis 168 rncS


Mycobacterium tuberculosis 
H37Rv Rv2926c


Mycobacterium tuberculosis 
H37Rv Rv2927c


Streptomyces verticillus 


Escherichia coli K12 cydC 


Streptomyces coelicolor A3(2) 
SC9C7.02 




db Match 






sp:AMYH_YEAST 




h- 
O 
> 

1 

m 
to 
o 
>- 

CL 

tn 


Z5 • 
h- 
O 
> 

s 

1 

> 
< 

'6, 
</) 




_i 
O 
u 

ut 

1 

cn 

UJ 

u. 
>- 
d 

cn 


pir:S72748 






CO 

o' 

UJ 

cr 
z 

D 
d 


_j 
O 
o 

UJ 

1 

o 

' d 


pir:B69693 


sp:Y06F_MYCTU 


O 
> 

o' 

to 
o 
>- 

d 
to 


O 
o 

CD 
CN 

o 

CN 

t:" 

Q. 


sp:CYDC_ECOLI


gp:SC9C7_2 




gt 


o 

in 


CM 
O 


3393 


r> 

CO 
O) 


in 
CD 

CO 


CN 
CO 

eg 


1854 


CO 

in 
CO 


CO 
CO 


CO 
CO 




in 

£ 


00 

in 
00 




■V 
ro 
in 


<Ji 
00 


to 


1530 


CN 
CN 




Terminal
(nt) 


175888 


177103 


176110 


181880 


179628 


183110 


183405 


185351 


187129 


187342 


187233 


187692 


188313 


189166 


189906 


190540 


193165 


194694 


198004


198007 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CN 


CM 


CN 


CN 


CN 


CN 


CN 


CN 


CM 


CN 


CM 


CN 


CN 


Initial 
(nt)


176046 


176402 


179502 


180918 


183092 


183391 


185256 


186208 


186299 


1871601 


187679 


188306 


189170 


906681 


190439 


CO 
CM 
CO 


191522 


193165 


196883 


CO 
O) 




CN 


CN 


CN 


CM 


CN 




CN 


OJ 


CN 


CN 


CN 


CN 


CN 


I CN 


CN 


CN 


: CN 


CN 


CN 


CM 


C/) 2 e 


15750 


5751 


5752 


5753 


5764 


5765 


5766 


5767


5768 


5769


5770


5771 


5772 


5773


m 


5775 


5776 


5777 


5778


5779 


C/) 2 Q 


lo 

CO 
CN 
CN 


2261 


2262 


2263 


2264 


2265 


2266 


2267


2268 


2269


i CN 1 CN 
1 CN CN 


2272 


2273 


2274 


2275 


2276 


i 
i 

1 ^ 
1 ^ 


OO 

CN 
CN 


[ cn 
! 

1 CM 
1 CN 
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Function 


hypothetical protein 


peptidase 


sucrose transport protein 






maltodextrin phosphorylase/ 
glycogen phosphorylase 


hypothetical protein 


prolipoprotein diacylglyceryl 
transferase 


indole-3-glycerol-phosphate 
synthase / anthranilate synthase 
component II


hypothetical membrane protein 


phosphoribosyl-AMP cyclohydrolase


cyclase 


inositol monophosphate 
phosphatase 


phosphoribosylformimino-5- 
aminoimidazole carboxamide
ribotide isomerase 


glutamine amidotransferase 


chloramphenicol resistance protein 
or transmembrane transport protein 






Matched 
length 
(aa)


Ln 
o 


CO 

in 

CO 


CO 

CO 






CO 


m 

CJ> 
CM 


to 

CM 


O) 

to 


CO 
CM 
CM 


cn 

CO 


00 

in 

CM 


CM 


m 

CM 


o 

CN 


CN 
O 
XT 






Similarity 
(%) 


43.7 


CO 

s 


51.9 








66.4 


65.5 


62.1 


58.8 


79.8


97.7 


q 


97.6 


92.4 


54.0 






Identity 
(%) 


21.0 


32.9 


I'lZ 






36.1 


33.9 


31.4 


29.6 


29.4 


00 

rsi 
in 


97.3 


94.0 


95.9 


86.7 


256 




Table 1 (continued) 


Homologous gene


Thermotoga maritima MSB8 
TM0896 


Campylobacter jejuni ATCC 
43431 hipO 


Arabidopsis thaliana SUC1






Thermococcus litoralis malP 


Bacillus subtilis 168 yfiE


Staphylococcus aureus FDA 485 

lgt


Emericella nidulans trpC 


Mycobacterium tuberculosis 
H37Rv Rv1610


Rhodobacter sphaeroides ATCC 
17023 hisI


Corynebacterium glutamicum 
AS019 hisF


Corynebacterium glutamicum 
AS019 impA 


Corynebacterium glutamicum 
AS019 hisA


Corynebacterium glutamicum 
AS019 hisH


Streptomyces lividans 66 cmlR






db Match 


pir:A72322


UJ 
< 
0' 

Q. 
X 

cL 
(/) 


pir:S38197 






< 
o 

CO 

in 

CM 
CL 


D 
<n 
o 
< 

1 

UJ 

> 

CL 

1/) 


sp:LGT_STAAU


2 
Ul 

UJ 

1 

O 
CL 

h- 

d 


pir:H70556


sp:HIS3_RHOSH


O 

{£. 
O 
U 

X 

CL 
Ul 


m 
to 

01 
OJ 

t: 
a. 


CO 
r- 

in 
o 
u. 
< 

CL 


gp:AF060558_1 


-1 
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1 

_> 
5 

U 
id 
tn 






ORF 

/hn\ 


CD 
CM 


1263 


CD 
fO 
CO 




to 
r- 
CN 


2550 


O 
O 
O) 


CO 

cn 


o 

00 


in 
to 


in 

CO 


|v. 


in 

CM 
00 


00 
CO 


CO 
fO 

to 


to 
to 

CN 






Terminal 
(nt) 


2199758 


2201070 


2201073 


o 
tn 

"V" 

o 

CM 
CM 


2201594 


2201992 


2204591 


2207302 


2208367 


2209232 


2209920 


2210273 


2211051 


2211882 


2212641 


2214321 






Initial 
(nt) 


2198475 


2199808 


2201408 


2201584 


2201869 


2204541 


2205490 


2208249 


2209167 


2209888 


2210273 


(D 
O 

CM 
CM 


in 

00 
r- 
T— 

CN 
CN 


2212619 


2213273 


2215580 






(O 2 « 


5780 


5781 


CM 
CX3 

in 


5783 


5784 


5785


5786


5787


5788 


5789 


5790


r- 

r- 
in 


5792 


5793 


5794 


5795 






Sgi 

CO 2 C 


2280 


2281 


CN 

oc 

CN 

Cv 


2283


2284 


2285 


to 

CO 
C\ 


2287 


2288 


2289 


2290 


2291 


2292 


2293


2294 


2295 
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CO 
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o 
o 



Function 


short chain dehydrogenase or 
general stress protein 


diaminopimelate (DAP) 
decarboxylase 


cysteine synthase 




ribosomal large subunit 
pseudouridine synthase D 


lipoprotein signal peptidase 




oleandomycin resistance protein 




hypothetical protein


L-asparaginase


DNA-damage-inducible protein P 


hypothetical membrane protein


transcriptional regulator 




hypothetical protein 


isoleucyl-tRNA synthetase






Matched 
length 
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61.0 


61.7 




64.0 
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65.4 
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33.8 
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31.2 1 
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31.5 


CO 




0 


38.5 






Homologous gene 


Bacillus subtilis 168 ydaD 


Pseudomonas aeruginosa lysA 


Alcaligenes eutrophus CH34 
cysM 




Escherichia coli K12 rluD 


Pseudomonas fluorescens NCIB 
10586 lspA




Streptomyces antibioticus oleB 




Rhodococcus erythropolis orf17


Bacillus licheniformis


Escherichia coli K12 dinP 


Escherichia coli K12 ybiF


Streptomyces coelicolor A3(2) 
SCF51.06 




Streptomyces coelicolor A3(2)
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A364A YBL076C ILS1 
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Function 


66 
E g S 

<o E <D 

>^ i c 

CD ro 
o CN ro 

to -L ' 

in 

a. 2 c 

O 3 ra 
D ai ro 


penicillin binding protein 


penicillin-binding protein 




hypothetical protein 


hypothetical membrane protein 


hypothetical protein 




hypothetical protein 


5,10-methylenetetrahydrofolate
reductase 


V) 

ro 

m 
C 

ca 

tn 
c 

CO 

^ 
*c6 

E 


hypothetical membrane protein 


! 


hypothetical protein 


eukaryotic-type protein kinase


i 


hypothetical membrane protein 
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(aa) 
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Similarity 
(%) 




































67.6 


100.0 


58.8 




m 

CD 


88.8 


69.3 




65.3 


70.6 


62.0 


69.6 




68.8 


62.4 




58.4 


Identity 
(%) 


37.7 


100.0 


28.2 




56.1 


72.0 


39.4




36.3 


42.6 


CO 


35.7 




43.2 


34.2 




30.7 


Homologous gene 


Bacillus subtilis 168 murE 


Brevibacterium lactofermentum 
ORF2 pbp


Pseudomonas aeruginosa pbpB 




Mycobacterium tuberculosis 
H37Rv Rv2165c


Mycobacterium leprae 
MLCB268.nc 


Mycobacterium tuberculosis 
H37Rv Rv2169c




Mycobacterium leprae 
MLCB268.13 


Streptomyces lividans 1326 
metF 


Myxococcus xanthus DK1050 
ORF1 


Mycobacterium leprae 
MLCB268.17 




Mycobacterium tuberculosis 
H37Rv Rv2175c


Streptomyces coelicolor A3(2) 
pkaF 




Mycobacterium leprae 
MLCB268.23 
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m 
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2290973 1 


2291212 


2293323 
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00 
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2302685 


2302251 


o 
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O 
CO 
CN 


2303040 
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2291073 


2291197 


2293164 


O) 
CN 
CN 


2295127 
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2296898 


2297653 
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2299428 
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g 

n 



Function 


hypothetical membrane protein 


3-deoxy-D-arabino-heptulosonate-7-
phosphate synthase


hypothetical protein 


hypothetical membrane protein 


major secreted protein PS1 protein 
precursor 






hypothetical membrane protein 


acyltransferase


glycosyl transferase 


protein P60 precursor (invasion- 
associated-protein) 


protein P60 precursor (invasion-
associated-protein)


ubiquinol-cytochrome c reductase 
cytochrome b subunit 


ubiquinol-cytochrome c reductase 
iron-sulfur subunit (Rieske [2Fe-2S]
iron-sulfur protein cyoS 


ubiquinol-cytochrome c reductase 
cytochrome c 
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83.1 


Identity 
(%) 
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58.4 


35.1 


28.2 






100.0 
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26.4 


33.0 


34.3 


37.9 


CO 
00 

in 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2181


Amycolatopsis mediterranei 


Mycobacterium leprae 
MLCB268.21c


Mycobacterium tuberculosis
H37Rv Rv2181


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 cspl 






Corynebacterium glutamicum 
ATCC 13032 


Corynebacterium glutamicum 
ATCC 13032 


Streptomyces coelicolor A3(2) 
SC6G10.05c


Listeria ivanovii iap 


Listeria grayi iap 


Heliobacillus mobilis petB


Streptomyces lividans qcrA 


Mycobacterium tuberculosis 
H37Rv Rv2194 qcrC
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in 

CO 


1143 
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CO 


1602 


(N 
CO 


in 

CO 
00 


Terminal 
(nt) 


2307521 


2307697 


2309173 


2312252 


2313808 

1 


2314036 


2313915 


2314236 


2315678 


2317633 


2318804 


2319968 


2321472 


2323088 


1 

2324311 


Initial 
(nt) 


2306314 


2309082 


2309676 


2309835 


2312360 


2313833 


2314092 


2315423 


2316412 


2318775 


2319850 


2320594 


2323073 


2323759 


2325195 


go ? 


5887 


5888 


5889 


5890 


5891 


5892 


5893 


5894 


5895 


5896 


5897 
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5899 
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5901 


SEQ 
NO. 
(DNA) 
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ra 



Function 


cytochrome c oxidase subunit III




hypothetical membrane protein 


cytochrome c oxidase subunit II


glutamine-dependent 
amidotransferase or asparagine 
synthetase (lysozyme insensitivity 
protein) 


hypothetical protein 


hypothetical membrane protein 


cobinamide kinase 


nicotinate-nucleotide-
dimethylbenzimidazole
phosphoribosyltransferase


cobalamin (5'-phosphate) synthase




clavulanate-9-aldehyde reductase 


branched-chain amino acid 
aminotransferase 


leucyl aminopeptidase 


hypothetical protein 


dihydrolipoamide acetyltransferase 




lipoyltransferase 


Matched 
length 
(aa) 


00 
OD 




in 


cn 


o 

CO 




CO 
CN 


CN 

t— 




in 
o 
(n 




X— 

CN 


to 
n 


fO 

cn 


cn 


(D 




o 

CM 


Similarity 
(%) 


70.7




71.0 


53.9


99.8 


100.0 


60.2 


64.0 


66.9 


49.8




68.5 


70.3 


65.9 


67.0 


68.5 




65.7 


Identity 
(%) 


36.7




38.6 


28.7 


99.7 


100.0 


35.0


43.0


37.8 


25.3 




38.6 


40.1 


36.3 


40.2 


48.9 




36.7 


Homologous gene 


Synechococcus vulcanus




Mycobacterium tuberculosis 
H37Rv Rv2199c


Rhodobacter sphaeroides ctaC 


Corynebacterium glutamicum 
KY9611 ltsA


Corynebacterium glutamicum
KY9611 orf1


Mycobacterium leprae 
MLCB22.07 


Rhodobacter capsulatus cobP


Pseudomonas denitrificans 
cobU 


Pseudomonas denitrificans cobV




Streptomyces clavuligerus car 


Mus musculus BCAT1 


Pseudomonas putida ATCC 
12633 pepA 


Saccharopolyspora erythraea 
ORF1 


Streptomyces seoulensis pdhB 




Arabidopsis thaliana


db Match 


3 
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> 

X 

o 
u 

Q. 

(/I 




sp:Y00A_MYCTU 


sp:COX2_RHOSH


gp:AB029550_1 
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D 
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o 
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cn 
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CD 

r- 
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in 


1089 


CN 
CJ) 


cn 

CM 
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ro 

CJ> 
CO 
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CO 

m 
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(nt) 


2325273


2326121


2326472 


2326921 


2330435 


2330586 


2331967 


2332495 


2333600 


2334535 


2334481


2335028


2335915 


2338734 


2338748 


2341293 


2339440 


2342164 


Initial 
(nt)


2325887 


2326273 


2326900 


2327997 


2328516 


2330927 


2331200 


2331974 


2332512 


2333615


2334717 


2335741 


2337051 


2337235 


2339140 


2339269 


2340804 


2341412


n ■ CM 


5903 


5904 


5905 


5906 


5907 


5908 


5909 
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5911 


5912 


5913 


5914 


5915 


5916 


5917 


5918 


5919
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NO. 
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00 
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5 




Function 


lipoic acid synthetase 


hypothetical membrane protein 


hypothetical membrane protein


.»->» 
CN 
O) 
O 

oT 

U) 
CD 
t/j 
O 

cx 

(/> 
c 




hypothetical membrane protein 




mutator mutT domain protein 


hypothetical protein 




alkanal monooxygenase alpha chain 
(bacterial luciferase alpha chain)


protein synthesis inhibitor 
(translation initiation inhibitor) 






4-hydroxyphenylacetate permease 


transmembrane transport protein 


transmembrane transport protein 










Matched 
length 
(aa) 


m 

00 
CN 


(N 


CD 

in 
in 


5 
%f 




in 




ID 


oo 

CN 




o 

OJ 
CN 








CO 
CO 


00 

in 
























































Similarity
(%) 


70.9 




67.8


100.0 




63.7 






65.6 




60.9 


73.0 






53.4 


72.8 


66.1 










Identity 
(%) 


CO 


45.5 


32.9 


100.0 




41.4 




31.0 


36.7 




25.0 


40.5 






21.9 


42.4 


31.4 








Table 1 (continued) 


Homologous gene


Pelobacter carbinolicus GRA BD
1 lipA


Mycobacterium tuberculosis 
H37Rv Rv2219


Escherichia coli K12 yidE


Corynebacterium glutamicum
ATCC 13032 tnp




Streptomyces coelicolor A3(2)
SC5F7.04c






Thermotoga maritima MSB8 
TM1010 




Vibrio harveyi luxA 


Thermotoga maritima MSB8
TM0215 






Escherichia coli hpaX 


Streptomyces coelicolor A3(2)
SCGD3.10c


Streptomyces coelicolor A3(2)
SCGD3.10c
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cn 
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CO 
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in 
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o 
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2343347 


CO 

m 

CN 
V 
''T 
CO 
CN 
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2346289 


2347804 


2348078 


2350408 


2351996


2350912


2351310 


2352828 


2353225 


2355398 


2355180 


2356843 


2357354 


2357707 


2357290


2358130 


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(nt) 


2342304 


2343479 


CO 

•V 
'<r 

CO 
CN 


2347491 


2347505 


CO 
NT 

to 
to 

CO 


2350620


2351022


2351310
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2351980 


2352833 


2355156 


o 

in 
in 

CO 
CN 


2355521 


2356794 


2357264 


2357484


2357726 
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(aa.) 
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5925 
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O) 
CN 
O) 
LO 
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5938 
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ON 
D3S 
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CN 
CN 
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CN 
CN 
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CN 

CM 
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CN 

CN 
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o 
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CN 
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CO 
CO 

CN 


ro 

CN 
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ro 

CN 


2436 


ro 
(N 


2438 
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C 



a; 



Function 




heme oxygenase 


a> 

(0 

.g> 

0 ro 
E oi 

ig 
1^ 

E 

ra c 

01 ro 


glutamine synthetase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


galactokinase 


virulence-associated protein 




bifunctional protein (ribonuclease H
and phosphoglycerate mutase) 




hypothetical protein 


hypothetical protein 


phosphoglycolate phosphatase


low molecular weight protein- 
tyrosine-phosphatase 


hypothetical protein 


insertion element (IS402) 


Matched 
length 
(a.a) 




CN 


C7) 

o 

00 




CN 

a> 
ro 


o 

CO 


m 


ro 


00 

in 
ro 




CN 
00 

ro 




Oi 
CN 


CO 

CO 


O 
CN 


CD 

in 


QO 
CN 


CN 


Similarity 




78.0 


67.0 


73.0 


54.1 


58.2 


55.6 


53.7 


in 
in 




75.1 




58.6 


76.2 


54.4 


63.5 


65.5 


56.6 


Identity 
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57.9 


43.4 


43.6 


26.8 


33.4 


38.9 


24.9 


27.1 




54.7 
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49.2 


26.0


46.2


40.9


32.6 


Homologous gene 
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■c 

ro 

<u 
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o E 
O .c 


Streptomyces coelicolor A3(2)


Thermotoga maritima MSB8 
glnA


Streptomyces coelicolor A3(2)
SCE939C 


Mycobacterium tuberculosis 
H37Rv Rv2226


Streptomyces coelicolor A3(2)
SCC75A.11c


Homo sapiens galK1 


Brucella abortus vacB 




Mycobacterium tuberculosis 
H37Rv Rv2228c




Mycobacterium tuberculosis 
H37Rv Rv2229c


Mycobacterium tuberculosis
H37Rv Rv2230c


Escherichia coli K12 gph 


Streptomyces coelicolor A3(2)
SCQ11.04c ptpA


Mycobacterium tuberculosis
H37Rv Rv2235


Burkholderia cepacia 
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CO 
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CO 
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in 

CO 




rr 

in 
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CO 
CD 
CO 
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2359614 


t— 

2362818 


2365455 


2367413 


2367473 


2369083 


2369116 


2370908 


2371412 


2373289
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2375684 


2376720 


2376998 
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2359416 


2362748 
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in 
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c\ 
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2370381 


2370423 
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2373269 
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2375214 


2375767 


2377390 
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(a.a.) 1 
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in 
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in 
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5 


Function 




transcriptional regulator 




hypothetical protein 




pyruvate dehydrogenase component




ABC transporter or glutamine 
transport ATP-binding protein 




ribose transport system permease 
protein 


hypothetical protein 


calcium binding protein 




lipase or hydrolase 


acyl carrier protein


N-acetylglucosamine-6-phosphate 
deacetylase 


hypothetical protein 




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length 
(aa) 




in 








a 

T— 

O) 








CO 
CO 
CN 
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(N 


in 

CN 
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(N 

in 

CO 


in 
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m 

(N 


a 

00 
CN 






Similarity
(%) 
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1 78.9 




62.8 




58.7 


62.9 


55.2 




55.7 


80.0 


75.5 


65.7 
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30.4 




55.2 




55.9




33.7 

i 




25.4


26.2 


to 




29.6 


42.7 


43.9 


33.6 






Homologous gene 




Streptomyces coelicolor A3(2) 
SC6F4.22c




Mycobacterium tuberculosis 
H37Rv Rv2239c




Streptomyces seoulensis pdhA 




Escherichia coli K12 glnQ 




Bacillus subtilis 168 rbsC


Rickettsia prowazekii Madrid E 
RP367 


Dictyostelium discoideum AX2
cbpA 




Streptomyces coelicolor A3(2) 
SC6G4.24 


Myxococcus xanthus ATCC 
25232 acpP 


Escherichia coli K12 nagD 


Deinococcus radiodurans 
DR1192 
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1 
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o 

UJ 
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o 

1 
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O 
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cL 
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, Be 


(N 


00 

1 


00 

a> 


Ol 
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m 
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a> 

CO 


CO 

to 
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CO 

CO 
CO 


Ol 
CO 
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o 

OO 


CN 
CO 
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O) 
CN 


m 

CN 
CO 
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45 


Terminal 
(nt) 


2377484 


2378276 


2378489 


2378884 


2379770 


2382744 


2380765


1 

2382827 


2385426


2383622


2384509 


2386580


2385913 


2386614 


2387957 


2388821 


2389869 


CO 

o 
tn 

CO 
CN 




(nt) 


to 

CN 

. cn 

1 CN 


2377899 


2378292 


2379312 


2379426 


2380033 


2382240 


2383615 


to 

CO 
CO 
CN 


2384509


2385447


2385771 


2386284 


2387627 


2387667 


2387997 


2388838 


2390904 




SEQ 
NO 

(a.a.) 


5957 


5958 


5959 


5960 


5961 


5962 


5963


CO 

cn 
in 


5965 


5966 


5967 


5968 


5969


5970 


5971 


5972 


5973 


5974 




SEQ 
NO. 
(DNA) 


2457 


tX) 

in 

CN 


2459


2460 


2461 


2462 


' CO 

, to 

CN 


to 

CN 
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to 

CN 


CO 

to 

CN 


1 

2467 
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to 

CN 


1 
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2473 


CN 
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15 



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40 






Function 


hypothetical protein 












alkaline phosphatase D precursor 




hypothetical protein 


hypothetical protein 




DNA primase 


n> 
CO 
a> 

tn 
(TJ 

a> 
o 

c 
o 

JO 






L-glutamine:D-fructose-6-phosphate 
amidotransferase 






deoxyguanosinetriphosphate 
triphosphohydrolase 


hypothetical protein 


Matched 
length 
(aa) 


CN 












o 
ro 
to 




o> 
in 


CO 




ro 
ro 

CO 


00 

o> 






to 

CO 

to 










Similarity 
(%) 


75.3 












64.7 




73.1 


CM 




82.9 


CD 






CN 
CN 
GO 






76.3 


59.7 


Identity 


(vi 
m 












34.2 




44.4 


41.2 




59.1 


49.0 






59.1 






54.6 


o 

CO 


Homologous gene 


Streptomyces coelicolor A3(2) 
SC4A7.08 












Bacillus subtilis 168 phoD 




Streptomyces coelicolor A3(2) 
SCI51.17 


Mycobacterium tuberculosis 
H37Rv Rv2342




Mycobacterium smegmatis 
dnaG 


Streptomyces aureofaciens BMK 






tn 
TO 

E 

OJ 

E 

Ic/) 

-5 ^ 
tn 
J3 in 
o 

O OJ 

u 

S E 






Mycobacterium smegmatis dgt 


Neisseria meningitidis NMA0251 


db Match 


1 

< 

U 
CO 












sp:PPBD_BACSU 




■r- 

i 

in 

U 

CO 
d 
cn 


pir:G70661 




prf:2413330B 


il' 

CO 
TT 
O) 
CO 

D 

Q. 

cn 






GO 

to 

00 

in 
o 

< 

b. 






prf:2413330A 


to 
^1 

CN 
N 

< 

z 

id. 

0)in 


u 


to 

00 


CN 

o 


r«. 


CO \n 

^ CO 
in I 


CN 

n 


1560 


714 
1836 


a 

CN 


m 

CO 


1899 


CM 
CO 


ro 

CM 


(O 

to 

CD 


1869 


CN 

ro 


1152 


1272 


in 
co 


Terminal 
(nt) 


2391184 


2392075 


2392579 


2393970 
2393973 


2394935 


2396763 


2395273 


2399099 


2399397 


2399668 


2399405 


2401834 


2402080 


2402530


2402144 


CD 
00 

o 

CM 


2406822


2404987 


2406262 


Initial 
(nt) 


23S200B 


2392566 


2393349 


2393425 


2394437 


2394594 


2395204 


2395986 


2397264 


2399158 


2400342 


2401303 


2401373 


2401838 


2403165 


2404012 


2404S23 


2405671 


2406258 


2406936 


w z s 


5975 


5976 


5977 


5978 


5979 


5980 


5981 


5982 


5983 


5984 


5985 


5986 


5987 
5988 


5989 


5990 


5991


5992 


5993 


C7> 
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NO. 
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2475 


2476 


CM 


CO 

h- 
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<N 


CO 
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CM 


00 
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CN 
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00 
CN 


2486
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2490 
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. 

CM 
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3 



.CO 



Function 


hypothetical protein 


hypothetical protein 




glycyl-tRNA synthetase 


bacterial regulatory protein, arsR 
family 


ferric uptake regulation protein 


hypothetical protein (conserved in 
C. glutamicum?)


hypothetical membrane protein 


undecaprenyl diphosphate synthase 


hypothetical protein 


Era-like GTP-binding protein


hypothetical membrane protein 


hypothetical protein 


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 


phosphate starvation inducible 
protein 


hypothetical protein 




Matched 
length 
(a.a) 


CM 
O) 
(O 


CO 

r) 






00 


CM 
CO 


in 




CO 
CO 
CM 


in 


CN 


CO 


in 


in 

00 


CO 


CD 
CN 




Similarity 
(%)


63.6 






69.9 


73.0 


70.5 


46.7 


67.0 


71.2 


(O 


70.3 


CM 
00 


86.0 


50.0 


84.6


75.4 




Identity 
(%) 


31.1 


24.6 




46.1 




34.9 


24.8 


40.6 


CO 


45.7 


39.5 


52.8 


65.0 


45.0 


61.1 


44.0 




Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2345


Drosophila melanogaster
CG10592




Thermus aquaticus HB8 


Mycobacterium tuberculosis 
H37Rv Rv2358 furB


Escherichia coli K12 fur


Mycobacterium tuberculosis
H37Rv Rv1128c


Streptomyces coelicolor A3(2)
h3u 


Micrococcus luteus B-P 26 uppS 


Mycobacterium tuberculosis 
H37Rv Rv2362c


Streptococcus pneumoniae era 


Mycobacterium tuberculosis 
H37Rv Rv2366


Mycobacterium tuberculosis
H37Rv Rv2367c


Neisseria meningitidis 


Mycobacterium tuberculosis 
H37Rv Rv2368c phoH


Streptomyces coelicolor A3(2)
SCC77.19c




db Match 


CM 

o 
cn 

L. 
Q. 


gp:AE003565_26




pir:S58522


in 

00 

m 
o 
r** 

UJ 
Q. 


Zj 
0 
u 

UJ 
D 

UL 

'd. 
cn 


pir:A70539 


1 

00 
CO 

Oi 

CM 
CO 

u. 

< 

Q. 

n\ 
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O 
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0. 
D 

iOL 
V) 


CO 
CO 
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a 

< 

l: 
'5. 


00 
CM 

O 

!^ 

'6. 

O) 


5 

a 
> 

icL 
(/) 


sp:YN67_MYCTU 


GSP:Y75650 


sp:PHOL_MYCTU 


o 
o 
to 

cL 

O} 






2037 


00 


CM 
«> 

in 


1383 


(O 
CO 


CN 
CO 


1551 


CM 
CD 


CD 
CM 


CO 
CM 


in 

(D 


1320 


CO 
00 

in 


CO 
CM 


1050 


CO 
CM 


cm! 

^J- ! 

cn ! 


Terminal 
(nt) 


2409029 


2409779 


2410280 


2410956 


2412948 


2413423 


2415118 


2415298 


r) 

CO 
CN 


2417222 


2417969 


2418990 


2420313 


2421236 


2420900 


2421975 


2423791 


Initial 
(nt) 


2406993 


2410264 


2410861 


2412338 


2412580 


2412992


2413568


2416089


2417099 


2417947 


2418883 


2420309 


2420900 


2420973 


2421949 


2422697 


2422850 
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NO. 
(a.a.) 


5995 
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5997 


5998 


5999 
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6001 


6002 
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6005 1 
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0) 
X3 



Function 


isopentenyl-diphosphate Delta- 
isomerase 












beta C-S lyase (degradation of 
aminoethylcysteine) 


branched-chain amino acid transport 
system carrier protein (isoleucine 
uptake) 


alkanal monooxygenase alpha chain 




malonate transporter 


giycolate oxidase subunit 


transcriptional regulator




hypothetical protein 




heme-binding protein A precursor 
(hemin-binding lipoprotein) 


oligopeptide ABC transporter 
(permease) 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 


Matched 
length 














in 

CN 

n 


TT 


to 




TT 

CN 
CO 


CO 
CO 


CO 

o 

CN 




to 
^ 




to 
in 


in 


CN 


CM 
N- 
CO 


Similarity 


57.7












100.0 


100.0 


49.0 




60.5 


55.1 


65.0




57.6 




55.5 


73.3


in 
v' 


to 

CD 


Identity 


CO 












99.4 


99.8 


21.6 




25.9 


27.7 


25.6 




22.5 




27.5 


40.0 


43.2 


CO 


Homologous gene 


Chlamydomonas reinhardtii ipll 












Corynebacterium glutamicum 
ATCC 13032 aecD 


Corynebacterium glutamicum 
ATCC 13032 brnQ 


Vibrio harveyi luxA




Sinorhizobium meliloti mdcF


Escherichia coli K12 glcD


Escherichia coli K12 ydfH




Salmonella typhimurium ygiK 




Haemophilus influenzae Rd 
HI0853 hbpA 


Bacillus subtilis 168 appB 


Escherichia coli K12 dppC


Escherichia coli K12 oppD 


db Match 


pir:T07979
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hypothetical protein | 


hypothetical protein 


ribose kinase 


hypothetical membrane protein 
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hypothetical protein 


30S ribosomal protein S20 


threonine efflux protein


ankyrin-like protein 


hypothetical protein 


late competence operon required for 
DNA binding and uptake 


late competence operon required for 
DNA binding and uptake 




hypothetical protein 


phosphoglycerate mutase 


hypothetical protein 


hypothetical protein 




gamma-glutamyl phosphate 
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xanthine permease 


2.5-diketo-D-gluconic acid reductase 






50S ribosomal protein L27


50S ribosomal protein L21 
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folyl-polyglutamate synthetase 








valyl-tRNA synthetase 


oligopeptide ABC transport system 
substrate-binding protein 


heat shock protein dnaK 
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malate dehydrogenase 


transcriptional regulator 
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malonate transporter 


class-III heat-shock protein or ATP-
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protocatechuate catabolic protein | 


beta-ketothiolase




3-oxoadipate enol-lactone hydrolase
and 4-carboxymuconolactone 
decarboxylase 
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Function 


toluate 1,2 dioxygenase subunit 


toluate 1,2 dioxygenase subunit 


1,2-dihydroxycyclohexa-3,5-diene
carboxylate dehydrogenase 


regulator of LuxR family with ATP- 
binding site 


transmembrane transport protein or 
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benzoate membrane transport 
protein 


ATP-dependent Clp protease
proteolytic subunit 2


ATP-dependent Clp protease
proteolytic subunit 1


hypothetical protein 
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Function 






galactose-6-phosphate isomerase 


hypothetical protein 


hypothetical protein 


aminopeptidase N


hypothetical protein 








phytoene desaturase 






phytoene dehydrogenase 


phytoene synthase 


multidrug resistance transporter




ABC transporter ATP-binding protein 


dipeptide transport system 
permease protein 


nickel transport system permease
protein 
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Staphylococcus aureus NCTC 
8325-4 lacB 


Bacillus acidopullulyticus ORF2
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Streptomyces lividans pepN 
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9175 crti 






Myxococcus xanthus DK1050
carA2 


Streptomyces griseus JA3933 
crtB 


Listeria monocytogenes lltB 




Synechococcus elongatus 


Bacillus firmus OF4 dppC 
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Function 




acetylornithine aminotransferase 


hypothetical protein 


hypothetical membrane protein 


acetoacetyl CoA reductase 


transcriptional regulator, TetR family


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 


ABC transporter ATP-binding protein 


globin 


chromate transport protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical membrane protein 


alkaline phosphatase
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length 
(aa) 
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Homologous gene 




Corynebacterium glutamicum 
ATCC 13032 argD


Mycobacterium tuberculosis 
H37Rv Rv1128c


Mycobacterium tuberculosis 
H37RV Rv0364 


Chromatium vinosum D phbB 


Streptomyces coelicolor acti 1 . | 


Neisseria meningitidis 


Pseudomonas putida GM73 
ttg2A 


Mycobacterium leprae 
MLCB1610.14C 


Pseudomonas aeruginosa 
Plasmid pUM505 chrA 


Mycobacterium tuberculosis 
H37RV Rv2474c 


Streptomyces coelicolor A3(2)
SC6O10.19C 




Aeropyrum pernix K1 APE1182 


Escherichia coli K12 yjjK 


Mycobacterium tuberculosis 
H37Rv Rv2478c


Mycobacterium leprae o659 


Bacillus subtilis phoB 


db Match 




sp:ARGD_CORGL 


pir:A70539 


t- 
O 
> 

> 
CL 
(/> 


> 

X 

o 

od' 
cn 

X 
CL 

CL 
l/l 


CD 

s 

a 
< 
a. 


m 

CO 

> 

a 
O 


r— 

! cn' 

1 o 
o 

CO 

o 
lZ 

< 

iCL 
Ol 


CO 

5 
u 
-J 

CL 

cn 


UJ 

s 

CO 

CL 

<' 

on 

X 

u 

CL 

(/) 


pir:A70867


o' 

o 

CO 

O 

CO 

cL 

CJ> 




pir:B72589


-J 
O 

u 

UJ 

—i 

>- 

cL 
in 


pir:E70867 


UJ 

_j 
u 
>• 

J 

in 
o 
> 
cL 
tn 


pir:C69676 


si 


1941 




1584 




GO 

o 


CO 


r— 


CN 

cn 


CO 
C3> 
CO 


1128 


CN 
CO 


un 

CD 


CN 
CO 


CN 
CD 


CO 
CD 

to 


in 

CD 


2103


1419 


Terminal
(nt) 


2584504 


2585926 


2587763 


2588722 


2588725 


2590302 


2591137 


2591574 


2592794 


2593965 


2593968 


2594597 


2595188


2595822


2596048


2597869


2598662 


2602879 


Initial 
(nt) 


2582564 


2584613 


2586180 


2587976 


2589432 


2589565 


2590697 


2592365 


2592402 


2592838 


V 
Oi 

LO 

un 

CN 


2595061 


2595808


2595983 


2597715


2598483 


2600764 


2601461 


SEQ 

NO. 
(a.a.) 


6180 


6181 


CN 
CO 

CO 


6183 


6184 


6185 


6186 


6187 


6188 


6189 


6190 


6191 


6192 


6193 




■ 1 
6195 


6196 


6197 


SEQ 
NO. 
(DNA)

2680 


2681 


2682 


2683 


2684 


2685


2686 


2687 


2688 


2689 


2690 


2691 


2692


2693 


£3i 
CO 
CN 


2695 


2696 


2697 










0) 

JD 
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iradation 










pyrazinamidase/nicotinamidase




lory protein 
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c 


Function 


ferric enterochelin ester 


lipoprotein 








transposase (IS1207) 






transcriptional regulator


glutaminase 


sporulation-specific deg 
regulator protein 




uronate isomerase




hypothetical protein 


hypothetical protein 


bacterioferriltn comigral 


bacterial regulatory prol 
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Homologous gene 


Salmonella enterica iroD 


Mycobacterium tuberculosis 
H37Rv Rv2518c lppS








Corynebacterium glutamicum 
ATCC 21086 






Salmonella typhimurium KP1001 
cytR 


Rattus norvegicus SPRAGUE- 
DAWLEY KIDNEY 


Bacillus subtilis 168 degA 




Escherichia coli K12 uxaC 




Zea diploperennis perennial 
teosinte 


Mycobacterium avium pncA 


Mycobacterium tuberculosis 
H37RV Rv2520c 


Escherichia coli K12 bcp 


Streptomyces coelicolor A3(2) 

scm.oic 
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Table 1 (continued) 


Homologous gene


Corynebacterium 
ammoniagenes ATCC 6871 ppti 


Corynebacterium glutamicum 
lmrB


Synechocystis sp. PCC6803




Corynebacterium 
ammoniagenes fas 


Streptomyces coelicolor A3(2) 
SC4A7.14 


Mycobacterium tuberculosis 
H37RV Rv0950c 


Mycobacterium tuberculosis 
H37Rv Rv1343c
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B1549_F2_59 


Mycobacterium tuberculosis 
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Pseudomonas aeruginosa 
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D-glutamate racemase 




bacterial regulatory protein, marR
family 


hypothetical membrane protein 


1 


endo-type 6-aminohexanoate 
oligomer hydrolase 


hypothetical protein 


hypothetical protein 
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ribonucleotide reductase beta-chain 


ferritin 


sporulation transcription factor 


iron dependent repressor or 
diphtheria toxin repressor


cold shock protein TIR2 precursor 


hypothetical membrane protein 


ribonucleotide reductase alpha- 
chain 




50S ribosomal protein L36


NH3-dependent NAD(+) synthetase 






hypothetical protein 
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alcohol dehydrogenase 


Bacillus subtilis mmg (for mother cell
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hypothetical membrane protein 


hypothetical membrane protein 
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hypothetical protein 
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hypothetical protein 


transcriptional regulator 
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Table 1 (continued) 
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Deinococcus radiodurans R1 
DR1644 


Coxiella burnetii Nine Mile Ph I
sucD 


Aeropyrum pernix K1 APE1069


Bacillus subtilis 168 sucC 




Streptomyces roseofulvus frnE 




Clostridium kluyveri cat1


Azospirillum brasilense ATCC
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permease protein 


ro 
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Function 




phosphate transport system 
regulatory protein 


phosphate-specific transport 
component 


phosphate-binding protein S- 
precursor 


acetyltransferase




hypothetical protein 
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hypothetical protein 


hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


5'-phosphoribosyl-N- 
formylglycinamidine synthetase 




5'-phosphoribosyl-N-
formylglycinamidine synthetase


hypothetical protein 




glutathione peroxidase


extracellular nuclease 




hypothetical protein 


C4-dicarboxylate transporter 


dipeptidyl aminopeptidase


Matched 
length 
(aa) 


CN •; 


in 
ro 


CN 


CN 


CO 
CD 




CO 
tN 
CM 


O) 

r- 




CO 

m 


lO 
(D 
O) 




CM 




cn 

CO 


Similarity 
(%) 


75.8 


94.0 


87.1 


71.0 


in 
ch 

00 




93.3 


93.7 




77.9 


51.5 




68.7 


81.6 


70.6 


Identity 
(%) 


57.3 


75.9 


67.7 


64.0 


77.6 




80.3 


81.0 




CM 
CO 


28.0 




37.4 


49.0 


41.8 


Homologous gene 


Mycobacterium tuberculosis 
H37RV Rv0807 


Corynebacterium 
ammoniagenes ATCC 6872 


Corynebacterium 
ammoniagenes ATCC 6872 
ORF1 


Sulfolobus solfataricus 


Corynebacterium 
ammoniagenes ATCC 6872 
purL 




Corynebacterium 
ammoniagenes ATCC 6872 
purQ 


Corynebacterium 
ammoniagenes ATCC 6872 
purorf 




Lactococcus lactis gpo 


Aeromonas hydrophila JMP636 
nucH 




Mycobacterium tuberculosis 
H37Rv Rv0784


Salmonella typhimurium LT2 
dctA 


Pseudomonas sp. W024 dapbl 


db Match 


pir:H70536


gp:AB003158_2 


1 

00 

in 

CO 

o 
o 
CD 
< 

Ol 


CN 

o' 
<o 
Oi 
CO 

T— 

(f) 
lO 

'■ CL 

0 ^ 


CN 
CO 

CO 

o 
o 

CD 
< 

Id. 




1 

CN 
CO 

CO 

o 
o 

CO 

< 

CL 
CI 


gp:AB003162_1 




< 

cn 

CM 

m 
o 

CM 
CM 

t: 


prf:2216389A 




pir:C70709 


>- 
h- 
_J 
< 
CO 

^' 

CJ 
Q 

icL 

Vi 


< 

CO 
CO 

CO 

o 

CM 
CI 


u 


CO 


1017 


r- 


CO 
oo 


2286 


o 

CM 


C3) 
CO 
(O 


CO 
CM 


CM 
CN 

in 


^ 


00 
CM 


CO 
CM 


00 

CO 


1338 


2118


Terminal 
(nt) 


2747683 


2749111 


2749162 


2752103 


2750027 


2763121 


CM 
fO 
CM 

in 

CM 


2752995 


o 

00 
CO 

in 

CM 


2753328 


2756739 


2757126 


2757129 


2757863 


2759532 


Initial 
(nt) 


2748057 


2748095 


2749902 


2751918 


2752312 


2752402 


2752995 


2753237 


2753298


2753804


2753992


2756851


2757815


2759200


2761649


SEQ 

NO.
(a.a.)


6343 


6344 


6345 


6346 


6347 


6348


6349 


6350 


6351


6352 


6353 


6354 


6355 


6356 
6357


1 >^ 9 ^ 

w 2 Q 


2843 


CD 


in 

CO 
CN 


(O 

oo 

CN 


I CN 


2848 
2849 


2850 


2851 


2852 


2853 


2854 


2855 
2856 


2857
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C 



Function 


! 


5'-phosphoribosyl-4-N- 
succinocarboxamide-5-amino
imidazole synthetase 


adenylosuccino lyase 


aspartate aminotransferase 


5'-phosphoribosylglycinamide 
synthetase 


histidine triad (HIT) family protein


i 


hypothetical protein 


di-/tripeptide transporter


adenosylmethionine-8-amino-7- 
oxononanoate aminotransferase or 
7,8-diaminopelargonic acid 
aminotransferase 


dethiobiotin synthetase 


two-component system sensor 
histidine kinase 


two-component system regulatory 
protein 


transcriptional activator


metal-activated pyridoxal enzyme or 
low specificity D-Thr aldolase 


Matched 
length 

(a.a) 




oi 

CM 




in 
cn 

CO 


in 

CN 


to 

CO 
t— 




CO 
(N 


CD 


CO 
CN 


Si 


in 

CO 
CO 


T- 

co 

CM 


0) 


CN 
00 
CO 


Similarity 
(%) 




89.1


o 

if) 

O) 


62.3 


86.4 


80.2 




56.4 


67.6 


98.8 


99.6 


70.5 


72.7 


in 
ai 

CO 


53.9 


Identity 
(%) 




70.1 


CO 

ih 

00 


28.1 


71.1 


53.7 




CO 

CO 
CN 


30.1 


95.7 


98.7 


31.3 


42.0 


CO 


30.9 


Homologous gene 




Corynebacterium 
ammoniagenes ATCC 6872 
purC 


Corynebacterium 
ammoniagenes ATCC 6872 
purB 


Sulfolobus solfataricus ATCC 
49255 


Corynebacterium 
ammoniagenes ATCC 6872 
purD 


Mycobacterium leprae u296a 




Methanosarcina barkeri orf3


Lactococcus lactis subsp. lactis 
dipT 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233
bioA 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233 
bioD 


Lactococcus lactis M71 plasmid
pND306 


Thermotoga maritima drrA


Streptomyces lividans tipA


S ; 
§ 1 

cL 
1/1 

ro 
O 

< 


db Match 




£ 

CO 

o 
o 
m 
< 


CN 
^1 

CD 

T— 

CO 

o 
o 

CD 
< 

icL 
cn 


O 
(/) 
-1 

(/) 

1 

VI 


£ 

CO 

o 
o 

CD 

< 

CL 
U) 


LU 

o 
> 

X 

>- 

icL 
(/) 




pir:S62195 


3 

3 
^' 

a. 

□ 
io. 
</) 


sp:BIOA_CORGL 


-J 
O 
cc 
O 
o 

1 

Q 
O 
CD 

id. 
\n 


D 

00 

CD 

o 
u. 

< 

CL 

o> 


prf:2222216A


_j 
on 

in 

<' 
a 

H 

ioL 
(/> 


< 

o 
m 

CO 
CN 

t: 
cx 


ORF 


CN 
CO 


o> 

CO 


1428 


1158 


CO 

CO ■ 

CN 

t- 




in 

CO 


CO 

in 
r»- 


1356 


1269 


CN 
CO 


1455 


in 
o 


CO 

in 


1140 


ra 

C ^ 

is. 

1- 


2761829 


2761785 


2763504 


2764978 


2766158 


2767993 


2767703 


CO 
CO 

CN 


2769156 


2771982 


2772660 


CD 
CN 

r>- 


2774110 


2774937 


2775740 


Initial 
(nt) 


2762452 


2762675 


2764931 


2766135 


o 

CN 

CD 
CN 


2767580 


2768137 


2769095 
2770511 


2770714 


2771989 


2774098 


00 

CN 


2775689 


2776879 


SEQ 
NO. 
(a.a) 


6358 


6359 


6360
6361


6362


6363 


6364 


6365 


6366 


6367 


6368 


6369 


6370 
6371


6372 


W Z Q 


2858


2859 

2860 
2861 


2862 


2863 


2864


2865


2866 
2867 


2868 


2869 


2870 
2871 


2872 
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0) 



Function 


pyruvate oxidase 


multidrug efflux protein 


transcriptional regulator 


hypothetical membrane protein 




3-ketosteroid dehydrogenase 


transcriptional regulator, LysR family 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical membrane protein 


transcription initiation factor sigma 


trehalose-6-phosphate synthase 




trehalose-phosphatase 


glucose-resistance amylase 
regulator 


high-affinity zinc uptake system 
protein 


Matched 
length 
(a.a) 


in 


o 
in 


CN 
C3) 


CN 




o 
ro 


CN 
CO 
OJ 


00 
N. 
CN 


CO 
CO 
CN 

! 




o 


CD 


in 
in 


CO 




in 

CN 


CO 


CO 

in 

CO 


Similarity 
(%) 


75.8 


68.9 


68.5


78.4 




62.1 


69.0 


52.9 


55.6 




50.7 


64.0 


50.3 


8 




57.6 


60.2 


CD 


Identity 
(%) 


46.3 


33.3 


30.4 


45.6 




34.3 


T- 

CO 


28.4 


26.7 




28.6 


36.0 


32.3


38.8 




27.4


24.7 


22.4 


Homologous gene 


Escherichia coli K12 poxB 


Staphylococcus aureus plasmid 
pSK23 qacB 


Escherichia coli K12 ycdC 


Mycobacterium tuberculosis
H37Rv Rv2508c




Rhodococcus erythropolis SQ1 
kstD1


Bacillus subtilis 168 alsR 


Mycobacterium tuberculosis 
H37Rv Rv3298c lpqC


Bacillus subtilis 168 ykrA 




Oryctolagus cuniculus kidney 
cortex rBAT 


Mycobacterium tuberculosis 
H37RV Rv3737 


Streptomyces griseus hrdB 


Schizosaccharomyces pombe 
tps1




Escherichia coli K12 otsB


Bacillus megaterium ccpA 


Haemophilus influenzae Rd 
HI0119 znuA


db Match 


cd' 

00 
X 

o 

a. 
O 
O 
Ol 

id. 


prf:2212334B 


sp:YCDC_ECOLI


pir:D70551




gp:AF096929_2


CO 
U 
< 

CD 

1 

cr 
c/) 

>j 
.< 
1 <^ 

in 


pir:C70982


S 

CO 
O) 
CO 

O 
'cx 






CO 

o 

h«- 
CD 
l: 
'5. 


pir:S41307 


O 
a 

X 

o 
</) 
^1 

w 
a 
h- 

CL 

in 




_j 
O 
u 

UJ 

cd' 

U) 
H- 

O 

b. 


LU 

o 
< 

CD 

<• 

a 

u 

u 

gL 


UJ 
< 
X 

<' 

Z 
fsJ 

CL 
(/) 


ORF 
(bp) 




1482 


m 
in 


1320 


CN 
OJ 


o 

CD 
Oi 


in 
o 


ro 

T- 

oo 


CO 

CO 


Oi 

m 


Oi 
Oi 
CO 


1503 


CN 
CO 


1455


CO 

in 


CO 
CD 


o 


CN 

<y> 


Terminal
(nt) 


2776768 


2780446 


2780969 


2782315 


2782340 


2784656 


2785651 


2788594 


2788587


2789477 


2790550 


2792448 


2792857 


2794327 


2794812 


2795637 


2795676 


2797806 


Initial 
(nt) 


2778504 


2778965 


2780439 


2780996 


2784481 


2785615 


2786355 


2787782 


2789399 


2789935 


2790152 


2790946 


2792531


2792873 


2794300 


2794870


2796749 


2796865 


SEQ 

NO. 
(a.a.) 

6373 


6374 


6375 


6376 


6377 


6378 


6379 


6380 


6381 


6382 


6383 


6384 


6385 


6386 


6387 


6388 


6389 


6390 


1 O A < 
1 CO 2 o 


2873 


2874 


in ! <o 
' r-- 

co ' 00 
OJ j CN 


2877


2878 


2879 


2880 


2881 
2882 


2883 


CO 

CO 
CN 


2885 

4£CJoD 


o- 

00 
00 
CN 


CO 
! fD 
' CD 
|OJ 


Oi 
CO 
130 
CN 


2890
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Function 


ABC transporter 


hypothetical membrane protein 


transposase (ISA0963-5) 




3-ketosteroid dehydrogenase 




lipopolysaccharide biosynthesis 
protein or oxidoreductase or
dehydrogenase 


dehydrogenase or myo-inositol 2-
dehydrogenase


shikimate transport protein 


shikimate transport protein 


transcriptional regulator 


ribosomal RNA ribose methylase or 
tRNA/rRNA methyltransferase 


cysteinyl-tRNA synthetase 


PTS system, enzyme II sucrose 
protein (sucrose-specific IIABC 
component) 


sucrose 6-phosphate hydrolase or 
sucrase 


glucosamine-6-phosphate 
isomerase 


N-acetylglucosamine-6-phosphate
deacetylase 


Matched 
length 
(aa) 


<n 

CM 
CN 


in 

CO 


CO 
O 
CO 




5 

in 




rr 
o 

CM 


CD 
(N 


CN 
O) 
CN 


o 
to 


CN 
CN 


fO 

ro 


T 

to 


CO 
CO 
CD 




00 
CN 


CO 

CD 

ro 


Similarity 


CM 

ro 

CD 


NT 
CO 


52.5 




62.0 




56.4 


59.5 


67.5 


80.8 


55.7 


47.3 


GO 
CO 

to 


77.0 

1 


to 


69.4 


ro 
d 

(0 


Identity
(%) 


31.4 


o 

d 

10 


CO 
CN 




32.1 




34.3 


35.2 


30.5 


ro 


32.6 


22.8 


42.2 


47.0 


35.3 


38.3 


30.2 


Homologous gene 


Staphylococcus aureus 8325-4
mreA 


Mycobacterium tuberculosis 
H37RV Rv2060 


Archaeoglobus fulgidus 




Rhodococcus erythropolis SQ1 
kstD1 




Thermotoga maritima MSB8 
bplA 


Bacillus subtilis 168 idh or iolG


Escherichia coli K12 shiA 


Escherichia coli K12 shiA


Streptomyces coelicolor A3(2) 
SC5A7.19C 


Saccharomyces cerevisiae 
YOR201C PET56


Escherichia coli K12 cysS 


Lactococcus lactis sacB 


Clostridium acetobutylicum 
ATCC 824 sctB 


Escherichia coli K12 nagB


Vibrio furnissii SR1514 manD 


db Match 


^1 

CN 
(O 
CN 

u 

< 

CL 
U) 


pir:E70507 


pir:A69426 




gp:AF096929_2 




pir:B72359 


D 

U) 

O 
< 
CD 

1 

D 

CN 

GL 
(/) 


_j 
O 
O 

<• 

X 
(0 

CL 

(/) 


-J 
0 
u 

UJ 

<' 

X 
C/5 

CL 
(/) 


< 
in 
O 
CO 

6. 


(- 

CO 
< 

UJ 

a 

id 
(/} 


sp:SYC_ECOLI


prf:2511335C 


s 
s 

o 

CM 
U- 

< 
d 


O 

u 

UJ 

1 

CD 

2 

id 

VJ 


D 

LL 
CD 

d 


u 


o 

C3) 
CO 


I/) 
cn 
in 


1500 


o 

CN 


1689 




CO 

to 


in 

CO 


in 
vn 

CO 


• to 

CN 
V 


to 


O) 

CO 

O) 


1380 


to 
a: 


1299 


at 
m 


CN 
in 

r- 


Terminal 
(nt) 


2796509 


2799391 


2801034 


2801313 


2801558 


2803250


2804074


2804676


2805113


2806016


<J) 

IT) 

8 

00 

1 ^ 


2807426 


2808399 


2809824 


2811960 


2813279 


2814081 


Initial 
(nt) 


2797820 


2798837 


2799535 


2801113 


2803246 


2803996 


2804691 


2805110


2805967 


2806441 


2807252 


2808364 


2809778 


2811806 


2813258 


2814037 


2815232 


SEQ 
NO.
(a.a.)


6391 


6392 


6393 


6394 


6395 


6396 


6397 


6398 


6399 


o 

§ 

1 ^ 


o 
to 


6402 


ro 
O 

(O 


o 

lO 


6405 


<D 
O 

to 


6407 


W Z Q 


2891 


2892 


2893 


2894 


2895 


o 

CN 


2897 


2898 


2899


2900


2901


2902


2903


2904 


2905 


2906 

1 


2907 
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Function 


dihydrodipicolinate synthase 


glucokinase 


N-acetylmannosamine-6-phosphate
epimerase 




sialidase precursor 


L-asparagine permease operon
repressor 


dipeptide transporter protein or 
heme-binding protein 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 


oligopeptide transport ATP-binding 
protein 


homoserine/homoserine lactone
efflux protein or lysE type 
translocator 


leucine-responsive regulatory 
protein 




hypothetical protein 


hypothetical protein 


transcription factor 


Matched 
length 
(aa) 


298 


(N 

CO 


o 

CN 
CN 




CO, 


CN 
CN 
CN 


o 

CO 

cn 


CN 
CO 


CO 


CO 

in 

CN 


CO 


CN 




CN 

in 


m 

CO 
CN 


m 


Similarity 
(%)


c\t 

(O 


57.6 


CO 

oci 

CO 




50.3 


57.2 


r— 

in 


64.3 


78.3 


78.7 


62.7 


66.2 




86.2 


71.5 


91.1 


Identity 
(%) 


28.2 


28.7 


36.4 




24.8 


25.6 


22.5 


31.9 


46.5 


CO 


m 

00 
CN 


31.0 




55.9 


46.4 


73.3 


Homologous gene 


Escherichia coli K12 dapA 


Streptomyces coelicolor A3(2)
SC6E10.20C glk 


Clostridium perfringens NCTC 
8798 nanE 




Micromonospora viridifaciens 
ATCC 31146 nadA 


Rhizobium etli ansR 


Bacillus firmus OF4 dppA 


Bacillus firmus OF4 dppB


Bacillus subtilis 168 oppD 


Lactococcus lactis oppF


Escherichia coli K12 rhtB 


Bradyrhizobium japonicum lrp




Mycobacterium tuberculosis 
H37RV Rv3581c 


Mycobacterium tuberculosis 
H37RV Rv3582c 


Mycobacterium tuberculosis 
H37RV Rv3583c 


db Match 


-J 
O 

o 

UJ 

<' 

< 
o 
a. 


sp:GLK_STRCO 


prf:2516292A 




sp:NANH_MICVI 


^' 

C7> 
^ 

CO 
r— 

< 
Q. 

cn 


CD 

D 
u. 
m 

d. 


sp:DPPB_BACFI 


D 

C/) 
O 
< 

1 q' 

CL 

o 

id 
(/) 


CL 
Q. 

O 
id 
u» 


_j 
O 
o 

UJ 
X 

ct 

a. 

v\ 


prf:2309303A 




pir:C70607 


D 
h- 
O 
> 

h-' 
00 

> 

d. 
u> 


pir:H70803 


u 


n 
o> 


o> 
o 


CO 
(7) 
CO 




1215 


a> 

CN 


1608 


in 
<j) 


1068 


CO 
r— 
CO 


CN 

<o 


CO 
00 


o 

CO 
CO 


o 

CO 


00 
CO 

r-- 


-^r 
o 
in 


Terminal 
(nt)


2815393 


2817317 


2818058 


2818137 


2818350 


2819557 


2822191 


2823337 


2825341 


2826156 


2826215 


2827404 


2827458 


2827904 


2828379 


2829156 


Initial 
(nt) 


2815458 


2816409 


2817363 


2818313 


2819564 


2820285 


2820584 


2822387 


CN 

CN 
CD 
CN 


2825341 


1 

2826835


2826922 


2827817 


2828383 


2829146


2829749 


SEQ 

NO. 
(a.a.) 


6408 


6409 


6410 


6411 


6412 
6413 


! ^ 
1 ^ 


i 

6415 


6416 


6417 


6418 


CO 


6420 


6421 


6422 
6423 


SEQ 
NO. 
(DNA) 


2908 


2909 


2910 


2911 


2912 
2913 


2914 


2915 
2916 
2917 


2918 


2919 


o 

CN 
O) 
CM 


2921 


2922 
2923 
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Function 


two-component system response 
regulator 


two-component system sensor 
histidine kinase




DNA repair protein RadA 


hypothetical protein 


hypothetical protein 


p-hydroxybenzaldehyde 
dehydrogenase 


! 


mitochondrial carbonate 
dehydratase beta 


A/G-specific adenine glycosylase 






L-2,3-butanediol dehydrogenase








hypothetical protein 


virulence factor 


virulence factor 


Matched 
length 
(a.a.)


CO 
CM 


CO 




CO 
CO 


CO 


CO 
CN 






o 

CN 


CO 
CD 
OJ 






258 








o> 






Similarity
(%) 


70.0 


CD 




74.3 


73.3 


53.3 


85.1 




66.2 


70.7 






99.6 








69.1 


63.0 


o 

to 
tn 


CO 








































Identity 
(%) 


43.5 


29.3 




41.5 


40.3 


29.4 


59.5 




36.7 


a> 






99.2 








48.5 


57.0 


64.0 


Homologous gene 


Mycobacterium tuberculosis 
H37RV Rv3246c mtrA 


Escherichia coli K12 baeS 




Escherichia coli K12 radA 


Bacillus subtilis 168 yacK 


Mycobacterium tuberculosis 
H37RV Rv3587c 


Pseudomonas putida NCIMB 
9866 plasmid pRA4000 




Chlamydomonas reinhardtii cal 


Streptomyces antibioticus IMRU
3720 mutY 






Brevibacterium saccharolyticum








Mycobacterium tuberculosis 
H37Rv Rv3592


Pseudomonas aeruginosa
ORF24222


Pseudomonas aeruginosa
ORF25110 


db Match 


prf:2214304A 


_j 

O 

o 

LU 

\ 

C/) 
UJ 

< 

CD 

d 
(/} 




_j 
O 
u 

UJ 

<' 

1 


CO 

o 
< 

CO 

o 
< 
> 

d. 


s 

CD 
O 

Q 
u: 
Q. 


1 

CO 
CO 
CO 
CD 

Oi 
' D 
CL 
CL 

Q, 
Ol 




pir:T08204


1 

CM 

Ui 






1 

CO 

o 

C7) 
O 

o 
CD 
< 

cn 








CN 

tn 
tn 
o 

LU 
*Q. 


CO 
CO 

CNJ 

> 
a 

CO 


GSP:Y29193 


ORF 
(bp) 


(O 


1116 


CN 
00 

in 


1392 


1098 


f«-. 
CO 
CO 


CvJ 

tn 

'ST 




CN 
CD 


<7> 

00 


1155 


CO 

o 

CO 


T 


CN 
CO 




CN 
CO 


(Ji 


o 

CN 


CO ! 

CM 


Terminal 
(nt) 


2830779 


2831894


2832666 


2834181 


2835285 


2835283 


CO 

s 

to 

CO 
CO 
CN 


2837591 


2837956 


2839521 


2840716 


2840758 


CO 

CO 
y— 

00 


2842453 


2843233 


2843716 


CM 
CO 

CO 

1 °° 
CN 


2845558 


2846101 


Initial 
(nt) 


2830057 


2830779 


2832085 


2832790 


2834188 


2835969 


2837499 


2837737 


2838576


CO 

to 

(O 
CO 
00 
CN 


2839562 


2841063 


2841075 


2842130 


! CO 
! Ol 

• 

• CM 
OO 


2843405 


2843722 


2845139 


oo 
oo 
m 

1 

00 


SEQ 

NO. 

(a.a.)


6424 


6425 


6426 


6427 


6428 


6429 


6430 


6431 


6432 


CO 
to 


6434 


6435 


6436 


6437 


6438 


6439 


1 s 

; CO 


6441 


1 s 


SEQ 

NO. 
(DNA)


2924 


2925 


ID 
CM 
O) 
CN 


2927 


oo 

CN 

cn 


2929 


2930 


2931


2932 


2933 


2934 


2935 


2936 


2937 


2938 


CD 

S 


O 

1 s 

1 ^ 


2941 


CM 

1 cn 

t CM 






3 

8 



Function 


virulence factor 


ClpC adenosine triphosphatase /
ATP-binding proteinase 


inosine monophosphate 
dehydrogenase 


transcription factor 


phenol 2-monooxygenase 










lincomycin resistance protein 


hypothetical protein 


lysyl-tRNA synthetase


pantoate-beta-alanine ligase
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GTPcycIo hydrolase 1 




cell division protein FtsH 


hypoxanthine 
phosphoribosyltransferase 


cell cycle protein MesJ or cytosine 
deaminase-related protein
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hypothetical membrane protein 
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hypothetical protein 
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peptide synthase 




phenylacetaldehyde dehydrogenase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


heat shock protein or chaperon or 
groEL protein 
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Function 




membrane transport protein or 
bicyclomycin resistance protein 


sodium dependent phosphate pump 


phenazine biosynthesis protein 




ABC transporter 


ABC transporter ATP-binding protein 


mutator mutT protein 


hypothetical membrane protein 


glutamine-binding protein precursor 


serine/threonine kinase 




ferredoxin/ferredoxin-NADP 
reductase 


acetyltransferase (GNAT) family
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insertion element (1S3 related) 


insertion element (IS3 related) 


two-component system sensor 
histidine kinase


transcriptional regulator 




adenylosuccinate synthetase 
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hypothetical protein 
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hypothetical membrane protein 


fructose-bisphosphate aldolase 
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294 804 9 


2949265 


2950431 


2950434 


2952691 


2952972 


2952975 


2954241 


2955523 


2956830 


2957485 


2958139 


2959520 


2960468 


2962730 


2963198 


50 


Initial 
(nt) 


2947591 


2947886 


2949186 


2949882 


2950207 


1 

2951723 


2951933 


2952709 


2954141 


2955272 


2956473 


2957447 


2958036 


2959110 


2960371 


2961187 


2963008 


2963596 




if) ^\ to 


6535 


6536 


6537 


6538


6539 


6540 


6541 


6542 


CO 

m 

CO 


6544 


6545


6546 


6547 


6548 


6549 


6550 


6551 


55 


to 2 o ■ J;^ 


3035 


3036 


3037 


3038 


3039 


o 
rr 
o 
to 


3041 


! CN 
; ^ 

t o 

1 ^ 


3043 


3044 


3045 


3046 


3047 


ID 

O 
CO 


O) 

li 


3050 


3051 
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c 



03 



Function 


virulence factor 


virulence factor 


virulence factor 


sodium/glutamate symport carrier 
protein 


cadmium resistance protein 


cation efflux system protein 
(zinc/cadmium) 


monooxygenase or oxidoreductase 
or steroid monooxygenase 


alkanal monooxygenase alpha chain 




cystathionine gamma-lyase 


bacterial regulatory protein, lacI
family 


rifampin ADP-ribosyl transferase 


rifampin ADP-ribosyl transferase 


hypothetical protein 


hypothetical protein 


oxidoreductase 




Matched 
length 

(a.a) 


O) 

uo 


o 
o 

CN 


CN 

m 


CO 


00 

o 


CO 
00 
CN 


CO 


<7) 

cr> 

CO 




CO 


CO 


O) 
CO 


CD 

vn 


S 

CO 


O 
CN 


CD 
00 
CO 


Similarity


82.0 


55.0 

i 


63.0 


54.8


71.3 


63.3 


45.4 


47.4 




62.4 


67.9 


65.2


87.5 


56.2 


64.7 


60.6 


Identity 
(%) 


76.0 


38.0 


o 

CN 
CO 


24.7 


37.0 


23.7 


22.5 


21.1 




36.5


40.2 


O) 


73.2 


30.5


33.8 


31.9 


Homologous gene 


Pseudomonas aeruginosa 
ORF24222 


Pseudomonas aeruginosa 
ORF23228 


Pseudomonas aeruginosa 
ORF25110 


Synechocystis sp. PCC6803 
slr0625 


Staphylococcus aureus cadC


Pyrococcus abyssi Orsay 
PAB0462 


Rhodococcus rhodochrous 
IF03338 


Kryptophanaron alfredi symbiont
luxA 




Escherichia coli K12 metB 


CN 

.co" 

< 

o 
o 
o 

4> 
O 
U 

«/} 
4> 
O 

e - 

is 

<l> T- 

CO (0 


Streptomyces coelicolor A3(2) 
SCE20.34carr 


Streptomyces coelicolor A3(2)
SCE20.34carr 


Mycobacterium tuberculosis 
H37Rv Rv0837c 


Mycobacterium tuberculosis 
H37Rv Rv0836c


Mycobacterium tuberculosis 
H37Rv Rv0385


db Match


GSP:Y29188 


CN 
GO 

<7> 

rs 

>- 
d. 
(/) 
0 


m 

O) 

<ji 

CN 

> 

CL 

o 


pir:S76683 


1 

1- 
w 

I 

LL 

u 

CL 
W 


pir:H75109 


1 

ay 

CO 

o 
o 

a. 


< 
> 

s' 

3 
-J 
id 
</) 




—1 
O 
u 

UJ 

1 

CO 
UJ 
d 


gp:SC1A2_11 


CN 
UJ 

o 

(0 
id 
O) 


CO 

o' 

CN 
UJ 

U 

CO 
d 

O) 


CO 

o 

Ul 
'q. 


pir:D70812 


pir:D70834


ORF
(bp)


r- 


CN 


CO 

ro 


1347 


r- 

00 
CO 


CO 
CO 


o 


1041 


CN 
CO 
h- 


1146 


r- 

CO 

in 


o 

CN 


CO 
00 


11251 


CM 
CO 

r-. 


1179 




Terminal 
(nt) 


2964434 


2965837 


CO 
ct) 

ir^ 

lO 
CO 

O) 
CN 


2966458 


2968789 


2969808 


2971003 


2972057 


2971338 


2972060


2973230 


2974200 


2974382 


2975591


2976360


r~ 

cn 

CN 




Initial 
(nt) 


2964258 


2965076 


2965185


2957804 


2958403 


2968951 


2969834 


2971017 


2972099 


2973205


2973796 


2973961 


o 
o 

CN 

h» 

O) 
CN 


2974467 


2975629 


2976596 






6552 


6553 


6554 


6555 


6556 


6557 


6558 


6559 


6560 


6561 


6562 


6563 


6564 


6565 


6566 


6567


SEQ
NO.
(DNA)


3052 


3053 


3054 


3055 


3056 


3057 


3058


3059 


3060 
3061 

3062 


3063 


3064


3065 


3066 


3067 
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Z3 
C 

c 



03 



Function 


N-carbamoyl-D-amino acid 
amidohydrolase 




hypothetical protein 


novel two-component regulatory 
system 


aldehyde dehydrogenase 


heat shock transcription regulator 


heat shock protein dnaJ 


nucleotide exchange factor grpE
protein bound to the ATPase domain 
of the molecular chaperone DnaK 


heat shock protein dnaK 


hypothetical membrane protein 


5'-methylthioadenosine
nucleosidase and S- 
adenosylhomocysteine nucleosidase 






chromosome segregation protein 






alcohol dehydrogenase 




Matched 
length 
(aa)


in 

( 




O) 
CO 
(N 


00 

o 


o 


wo 

CO 


ro 


fNl 


CO 
CD 


CO 

<o 
n 


uo 

OJ 






1311 






ro 
ro 


Similarity
(%) 


67.3 




55.4 


44.0 


90.3 


70.4 


80.1 


66.5 


00 
C7) 
(3) 


79.0 


o 

d 

CD 






48.4 






81.7 


Identity 
(%) 


32.0 




28.0


38.0 


69.6 


XT 


56.7 


38.7 


99.8 


42.6 


27.2 






18.9 






50.0 


Homologous gene 


Methanobacterium
thermoautotrophicum Delta H 
MTH1811 




Streptomyces coelicolor A3(2)
SC4A7.03 


Azospirillum brasilense carR


Rhodococcus erythropolis thcA 


Streptomyces albus G hspR


Mycobacterium tuberculosis 
H37Rv Rv0352 dnaJ


Streptomyces coelicolor grpE 


Brevibacterium flavum MJ-233 
dnaK 


Streptomyces coelicolor A3(2) 
SCF6.09 


Helicobacter pylori HP0089 mtn 






Schizosaccharomyces pombe 
cut3 






Bacillus stearothermophilus
DSM 2334 adh 


db Match 


pir:B69109 




gp:SC4A7_3 


<• 

a: 
o 

OD 
< 

O 


prf:2104333D


CN 

1 

O) 
(Ji 

rg 
ro 

D 
< 
c/) 
d. 


H- 

o 
>- 

.2 

1 

-o 
< 

Q 

! d 

! i/j 


O 

o 
cr 

w 

1 

UJ 
0. 

tr 

o 

id 
</j 


gsp:R94587 


gp:SCF6_8 


> ■ 

a 

-J 

UJ 
X 

1 

CO 
U- 

a 
d 






sp:CUT3_SCHPO 






sp:ADH2_BACST 




ORF
(bp)


00 


T- 
(N 


1134 


o 

CO 
CO 


1518 


00 

ro 


1185 


CD 
CO 
CD 


1854 


1332 


CO 
CO 

to 


1200 


CD 
00 
OO 


3333 


CD 

ro 
(D 


1485 


1035 




Terminal 
(nt) 


2977847 


2978979 


2980115 


2981216 


2980181 


2982023 


2982495 


2983887


2984544 


2988164 


2988214 


2988846 


2992602 


2989954 


2993286 


2993921 


2995747 




Initial 
(nt) 


2978644 


2978737 


2978982 


2980887 


2981698 


2982460 


2983679 


2984522 


2986397 


2986833 


2988846 


2990045 


2991718 


2993286 


2993921 


in 
o 

CO 
C7) 
O) 
CN 


2996781 


1 


SEQ 
NO. 
(aa.) 

6568 


6569 


6570 


6571 


6572


6573


6574


6575 


6576 


6577 


6578 


6579 


6580 


6581 
6582 


6583


6584


SEQ 
NO 
(DNA) 

3068 

3069 


3070 


3071


3072


3073


3074


3075 


3076


3077


3078 


3079 


3080 


3081 


"3082 


) 

3083


3084
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Function 










hypothetical membrane protein


hypothetical protein 




sulfate adenylyltransferase, subunit 
1 


sulfate adenylyltransferase small 
chain 


phosphoadenosine phosphosulfate 
reductase 


ferredoxin-nitrate reductase


ferredoxin/ferredoxin-NADP 
reductase 


huntingtin interactor






alkylphosphonate uptake protein 
and C-P lyase activity 


hypothetical protein 


ammonia monooxygenase 








Matched 
length 
(aa) 










o 

CO 


CN 
CN 






00 

o 

CO 


CN 

T— 

CN 


CN 

o 
m 


r- 

00 

TJ- 








CN 


o 

CO 


CD 




















































Similarity
(%)










70.1 


53.2 




78.3 


70.1 


64.2 


65.5 


61.4 


59.7 






59.9 


CO 
CD 
CO 


76.4 








Identity 
(%)










43.5


32.5 




47.3 


46.1 


39.2 


in 
CO 


30.8 


32.6 






26.8 


50.0 


39.1 






25 

0) 
=3 
C 

c 
o 
u 

30 ^ 

JS 

35 


Homologous gene 

i 










Bacillus subtilis ytnM 


Streptomyces coelicolor A3(2) 
SC7A8.10C 




Escherichia coli K12 cysN 


Escherichia coli K12 cysD 


Bacillus subtilis cysH 


Synechococcus sp. PCC 7942 


Saccharomyces cerevisiae 
FL200 arh1


Homo sapiens hypE 






Escherichia coli K12 phnB 


Streptomyces coelicolor A3(2)
SCE68.10 


Pseudomonas putida DSMZ ID 
88-260 amoA 








db Match 










CD 
Oi 
C3) 
CO 
U- 
l: 


o 

CO 

< 

O 
(n 

id. 
cn 




_j 
O 

o 

UJ 

\ 

z 

in 

o 

'd. 
to 


-J 
O 

o 

UJ 

1 

Q 

(/) 
> 
O 

id. 
if) 


D 
c/) 
O 
< 

CD 

1 

I 

b. 

in 


GL 

> 
a' 

2 

iCL 

tn 


sp:ADRO_YEAST 


prf:2420294J






-J 
O 
o 
ai 

m 

2 
X 
CL 

a. 

tn 


o 

1 

OO 
CD 

UJ 

o 

c/) 
d 

O) 


i 

< 

CL 
CL 

d. 

U) 




i 




ORF
(bp)


CN 


o 

CN 


CO 


to 

CN 


CN 

o 


n 

tN 


to 

O) 


1299 


tN 
OJ 


CO 
O 
(0 


ro 

00 
lO 


1371 


1083


CO 
CN 


ro 
m 




CD 
CD 
CO 


CN 
CN 

in 


CN 
CO 


CD 
00 




Terminal 
(nt)


2997366 


2997481 


2997876 


2997963 


2998528 


2999478 


3002426 


3000241 


3001542 


3002453 


3003480 


3006915 


3008376


3008453 


3009303 


3008749 


3009607 


3009710 


3010979 


o 
o 

CO 




Initial
(nt) 


1 

2997151 


2997687 


2997688 


2998223 


2999454 


3000200 


3001512 


1 

3001539 


3002453 


3003145 


3005162 


3005545 


3007294 


3008689 


3008770


3009162 


3009242 


3010231 


3010659 


3010926 




SEQ 
NO. 
(a.a.)


6585 


6586 


6587 


6588 


6589 


6590 


6591 


6592 


6593 


6594


6595 


6596 


6597 


6598 


6599 


6600


6601


6602


6603


Is 




SEQ
NO.
(DNA)


3085 


3086 


3087


3088 


3089 


3090 


3091 


3092 


3093 


3094


3095 


3096 


3097


3098 


3099


3100 


3101 


3102 


3103 


' ! 

• O i 
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10 




Function 


hypothetical protein




hypothetical protein 


ABC transporter 


ABC transporter


metabolite transport protein homolog






succinyl-diaminopimelate 
desuccinylase 








dehydrin-like protein 


maltose/maltodextrin transport ATP- 
binding protein 




cobalt transport protein 


NADPH-flavin oxidoreductase 


inosine-uridine preferring nucleoside 
hydrolase 


hypothetical membrane protein 


DNA-3-methyladenine glycosylase 


flavohemoprotein






Matched 
length 
(a.a) 


00 
iO 




n 

CO 


<j> 




CO 






CD 
CD 










CO 

r-. 
ro 




CD 


CO 
CN 


CO 


CO 

1^ 

CN 


CD 
N- 


CD 
O 






Similarity 
(%) 


o 
CO 
in 




CJ) 

in 


64.8 


73.0 


67.8 






48.5 




1 

1 


46.0 


50.1 




67.6 


71.4 


59.3 


CD 

m 


00 

cci 


63.8






Identity 

(%)


41.0




26.1 


in 
CO 


39.3 


30.8 






21.5 








33.0 


24.9 




30.2 


37.2 


c6 

CN 


31.2 


50.3 


33.5 




Table 1 (continued) 


Homologous gene 


Agrobacterium vitis ORFZ3 




Alcaligenes eutrophus H16 
ORF7 


Haemophilus influenzae hmcB 


Haemophilus influenzae hmcB 


Bacillus subtilis ydeG 






Escherichia coli K12 msgB 








Daucus carota 


Escherichia coli K12 malK 




Lactococcus lactis Plasmid
pN24000 Orf-200 cbiM 


Vibrio harveyi MAV frp 


Crithidia fasciculata iunH 


Streptomyces coelicolor A3(2) 
SCE20.08C 


Escherichia coli K12 tag 


Alcaligenes eutrophus H16 fhp






db Match 


> 
O 

> 
CL 
if) 




D 

ui 
u 
_> 
< 

1 

CD 

CD 
> 
id. 
«/» 


CO 

o' 

CD 
CO 
00 
CO 

X 
d 

Q) 


gp:HIU68399_3 


pir:A69778






-J 

8 

UJ 

Q. 
< 

Q 

id. 








CN 
< 

O 
Q 

i) 

O r- 


sp:MALK_ECOLI 




in 
oo 

o 
u. 
< 

O) 


sp:FRP_VIBHA


< 

X 

z 

id. 
in 


CO 

o' 

CN 

at 
O 

CO 

id. 

O) 


-J 

O 

o 

LU 
^1 

5 

ro 

Q. 


D 

UJ 

O 
< 

<' 

0. 

X 

id. 
t/\ 






ORF 
(bp) 


in 

CO 
CN 


(D 

m 


1002 


fO 
C7i 
CO 




1209 


(N 
CM 
OO 


CO 
CO 


ro 

CN 
CO 


1905 




CN 
CD 


in 
CJ} 


1068 


CN 
CO 


oo 

CD 


CD 
CO 


CO 

o 


in 

CD 


CO 

CO 
m 


1158 






Terminal 
(nt) 


3011273 


3011242 


3011808 


3013105 


3013837 


3015824 


3014548 


3016924 


3015827 


3019220 


3018312


3017420 


3018123 


3019542 


3020551 


3021208 


3022113 


3022998 


3025353 


3026139 


CO 
CN 
O 

ro 






Initial 
(nt) 


3010989 


3011805 


3012809 


3013798 


3014550 


3014616 


3015469 


3016238 


3017149 


3017316 


3017539 


3018181


3019076


3020609 


3021202 


1 

3021825


3022928


3023900 


3024379 


3025552 


3027299 






SEQ

NO. 
(a.a.) 


6605 


6606


6607 


6608


6609 


6610 


6611 


6612


6613 


6614 


6615 


6616 


6617 


6618 


6619 


6620 


6621 


6622 


6623 


6624


6625






SEQ

NO.
(DNA) 


3105 


3106 


1 

3107 


3108


3109


3110 


3111


3112 


3113 


3114


3115


3116


3117 

1 


3118


3119 


3120 


3121


3122 


3123


3124


3125



201 



8/17/2007, EAST Version: 2.1.0.14 



EP 1 108 790 A2 



C 



Xi 



Function 




oxidoreductase 




transcription antiterminator or beta- 
glucoside positive regulatory protein 




6-phospho-beta-glucosidase 


! 


0) 

v> 
re 
•u 
'io 
O 
u 

iS • 

X) 

6 

x: 
o. 
i/> 
o 

Q. 

d) 


aspartate aminotransferase 




transposase (ISCg2) 


hypothetical membrane protein 




UDP-glucose dehydrogenase 


deoxycytidine triphosphate 
deaminase 




hypothetical protein 




beta-N-acetylglucosaminidase


Matched 
length 




o 

(N 




CN 
O) 

T— 




<o 




CO 
CD 


o 




o 


0) 
O) 
CO 




CN 


GO 
CO 




O) 
(N 
CN 




0 


Similarity
(%) 




63.8 




CO 
O) 
<D 




59.9 




CO 

GO 


80.9 




100.0 


70.2 




72.2


72.3 




TT 

cri 
m 




58.1 


Identity 
(%) 




00 

ro 




28.1 




43.7 




43.9 


53.7 


100.0 


33.6 




in 
d 


CO 
CO 




30.6 




28.5 


Homologous gene 




Streptomyces coelicolor A3(2) 
mmyQ 




Escherichia coli K12 bglC




Clostridium longisporum B6405 
abgA 




Clostridium longisporum B6405 
abgA 


Methylobacillus flagellatus aat 




Corynebacterium glutamicum 
ATCC 13032 tnp 


Streptomyces coelicolor A3(2) 
SCQ11.10c




Sinorhizobium meliloti rkpK 


Escherichia coli K12 dcd 




Streptomyces coelicolor A3(2) 
SCC75A.16C 




Streptomyces thermoviolaceus
nagA 


db Match 




gp:SC0276673_18 




_j 
0 
o 

Ui 

o' 

-J 

o 

m 

io. 
tf) 




o 

—1 

U 
<' 

o 

m 
< 

t/i 




sp:ABGA_CLOLO 


CN 

00 

-J 

CL 




o> 

CO 
t— 

< 
b. 


gp:SCQ11_10




CO 

^ 

CO 

ro 

CN 
CN 

CN 

if 

CL 


sp:DCD_ECOLI 




(D 

<' 

m 

N- 

U 
0 

w 

ioL 




gp:AB008771_1 




CO 

o 

(O 


'<r 

(O 


<o 
m 


Oi 

in 


O) 
CN 


o 
to 

fO 


CO 
CO 


o 

-a- 

(N 


12571 


o 

o 

CO 


1203 


1257 


CO 

00 


1317 


m 


h- 
CO 
CM 


hi 
h- 


1689 


1185 


Terminal 
(nt) 


3028163 


3028891 


3029033 


3028884 


3029782 


3029702 


3030535 


3030101 


3031979


3032348


3033863 


3035437 


3034105 


3035440 


3036845 


3037911 


3038942 


3038993 


CO 

h- 

i ? 

CO 


Initial


3027561


3028268 


3028878 


3029474 


3029504 


3030061 


3030155 


3030340 


3030723 
3032647


3032661 


3034181 


3034287


3036756


1 

3037411


3037675 


3038172


3040681 


3041932 


SEQ 
NO. 
(a.a.)


6626


6627


6628


6629


6630


6631


6632


6633


6634 


6635 


6636


6637 


6638 


6639


6640


6641


6642


6643


CD 
CD 


SEQ 
NO. 
(DNA) 


3126 


3127 
3128 


3129 


3130 


3131 


3132 


3133 


3134 


3135 


3136 


3137 
3138


3139 


3140 


NT 

to 




CO 

T- 

co 


"sr 

T- 

CO 
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• 



Function 






hypothetical protein 






hypothetical membrane protein 


acyltransferase or macrolide 3-O-acyltransferase




hypothetical membrane protein 




hexosyltransferase 


methyl transferase 


« 

CQ 

c 

•5 

0 
-E 

CO 

0 

CJ 
CO 

1 

CL 

0 
C 
Q) 
0 

f s: 

0 H 


C4-dicarboxylate transporter


hypothetical protein


hypothetical protein 


membrane transport protein






Matched 
length 

(a.a) 






1416 






CO 
CD 
CO 


CO 

0 




CT) 
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Table 1 (continued) 


Homologous gene 






Mycobacterium leprae 
MLCB1883.13C 






Mycobacterium leprae 
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Streptomyces sp. acyA 




Mycobacterium leprae 
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Mycobacterium tuberculosis 
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hypothetical membrane protein 


hypothetical membrane protein 


propionyl-CoA carboxylase complex 
B subunit 


polyketide synthase 


acyl-CoA synthase 


hypothetical protein 




major secreted protein PS1 protein 
precursor 




antigen 85-C 


hypothetical membrane protein 


nodulation protein 
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Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
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dimethylaniline monpoxygenase (N- 
oxide-forming) 




UDP-galactopyranose mutase 


hypothetical protein 


glycerol kinase 


hypothetical protein 


acyltransferase 


seryl-tRNA synthetase 


transcriptional regulator, GntR family
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Function 


transcriptional regulator 








hypothetical protein 


glucan 1,4-alpha-glucosidase




glycerophosphoryl diester 
phosphodiesterase 


gluconate permease
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pyruvate kinase 


L-lactate dehydrogenase 


hypothetical protein 


hydrolase or haloacid 
dehalogenase-like hydrolase 


efflux protein 


transcription activator or 
transcriptional regulator GntR 


phosphoesterase 


shikimate transport protein 
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Table 1 (continued) 


Homologous gene 


Streptomyces coelicolor A3(2)
SC6G4.33 








Streptomyces lavendulae 
ORF372 


Saccharomyces cerevisiae 
S288C YIR019C sta1




Bacillus subtilis glpQ


Bacillus subtilis gntP 






Corynebacterium glutamicum 
AS019 pyk


Brevibacterium flavum lctA


Mycobacterium tuberculosis 
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Streptomyces coellcolor A3(2) 
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Brevibacterium linens ORF1 
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Escherichia coll K12 MG1655 
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Function 


L-lactate dehydrogenase or FMN-dependent dehydrogenase




immunity repressor protein




phosphatase or reverse 
transcriptase (RNA-dependent) 




peptidase or IAA-amino acid
hydrolase 




peptide methionine sulfoxide 
reductase 


superoxide dismutase (Fe/Mn) 


transcriptional regulator 


multidrug resistance transporter 








hypothetical protein 


membrane transport protein 


transcriptional regulator 


two-component system response
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Escherichia coli B msrA 
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Bacillus subtilis gltC 


Corynebacterium glutamicum 
tetA 
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hypothetical protein 
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Escherichia coli K12 MG1655 
tagi 


Mycobacterium tuberculosis 
H37Rv Rv2005c


Escherichia coli K12 MG1655
yhbW 


Chlorobium vibrioforme ybc5 
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Chlamydia muridarum Nigg 
TC0129 




Escherichia coli K12 MG1655 
glcC 
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Mycobacterium tuberculosis 
H37Rv Rv2744c
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Function 












methyltransferase 


nodulin 21-related protein 








transposon Tn501 resolvase




ferredoxin precursor 


hypothetical protein 


transposase 


transposase protein fragment 
TnpNC 




glyceraldehyde-3-phosphate 
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lipoprotein 


copper/potassium-transporting
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Function 


ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical protein 






DNA protection during starvation 
protein 


formamidopyrimidine-DNA 
glycosylase 


hypothetical protein 






methylated-DNA-protein-cysteine 
S-methyltransferase 


zinc-binding dehydrogenase or
quinone oxidoreductase 
(NADPH:quinone reductase) or 
alginate lyase 


! 


membrane transport protein 


malate oxidoreductase [NAD] (malic 
enzyme) 


gluconokinase or gluconate kinase


teicoplanin resistance protein 


teicoplanin resistance protein
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24.5 1 
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27.0 


Homologous gene 


Escherichia coli K12 MG1655
ybjZ 


Campylobacter jejuni Cj0606


Mycobacterium tuberculosis 
H37Rv Rv0046c






Escherichia coli K12 dps 


Escherichia coli K12 mutM or 


Escherichia coli K12ncB 






Homo sapiens mgmT 


Cavia porcellus (Guinea pig) qor




Mycobacterium tuberculosis 
H37Rv Rv0191 ydeA


Corynebacterium melassecola 
(Corynebacterium glutamicum) 
ATCC 17966 malE 


Bacillus subtilis gntK 


Enterococcus faecium vanZ 
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Function 


salicylate hydroxylase 


proton/glutamate symporter or 
excitatory amino acid transporter 2


tryptophan-specific permease 


anthranilate synthase component I




anthranilate synthase component II


anthranilate 

phosphoribosyltransferase 


indole-3-glycerol phosphate synthase (IGPS) and N-(5'-phosphoribosyl) anthranilate isomerase (PRAI)




tryptophan synthase beta chain 


tryptophan synthase alpha chain 


hypothetical membrane protein 


PTS system, II A component or
unknown pentitol
phosphotransferase enzyme II, A
component


ABC transporter ATP-binding protein 


ABC transporter 
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Homologous gene 
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Corynebacterium glutamicum 
AS019 ORF1


Brevibacterium lactofermentum 
trpE 




Brevibacterium lactofermentum 
trpG 


Corynebacterium glutamicum 
ATCC 21850 trpD


Brevibacterium lactofermentum 
trpC 




Brevibacterium lactofermentum 
trpB 


Brevibacterium lactofermentum 
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Streptomyces coelicolor A3(2) 
SCJ21.17C 


Escherichia coli K12 ptxA 


Pseudomonas stutzeri 
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cytchrome b6-F complex iron-sulfur 
subunit (Rieske iron-sulfur protein) 


NADH oxidase or NADH-dependent 
flavin oxidoreductase 


hypothetical membrane protein 


hypothetical protein 


bacterial regulatory protein, arsR 
family or methylenomycin A 
resistance protein 


NADH oxidase or NADH-dependent
flavin oxidoreductase 


hypothetical protein 










acetoin(diacetyl) reductase (acetoin 
dehydrogenase) 


hypothetical protein 


di-/tripeptide transporter




bacterial regulatory protein, tetR
family 


hydroxyquinol 1,2-dioxygenase


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length 
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Streptomyces coelicolor Plasmid 
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Thermoanaerobacter brockii 
nadO 


Saccharomyces cerevisiae 
ymyO 










Klebsiella terrigena budC 


Mycobacterium tuberculosis 
H37RV Rv2094c 


Lactococcus lactis subsp. lactis 
dtpT 




Escherichia coli K12 acrR 


Acinetobacter calcoaceticus 
catA 
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50 


Initial 
(nt) 


3245317 


3246931


3247234 


CN 

cn 

CO 

00 

CN 
CO 


3249534 


3249651 


3250758 


3251618


3251934


3252300 


3252636


3252728 


3253560 


3255182 


3255549 


3256298 


3257373 




SEQ
NO.
(a.a.)


6864 


6865 


6866 


6867 


6868 


6869 


6870 


6871 


6872 


6873 


6874


6875 


6876 


6877 


6878 


6879 


6880 




SEQ 
NO 
(DNA) 


3364


3365 


3366 


3367 


3368 


3369 


3370 


3371


3372


3373


3374


3375 


3376 


3377 


3378


3379 


3380 






3 



O 



Function 


maleylacetate reductase


sugar transporter or D-xylose-proton
symporter (D-xylose transporter)


bacterial transcriptional regulator or 
acetate operon repressor 


oxidoreductase 


diagnostic fragment protein 
sequence 


myo-inositol 2-dehydrogenase


dehydrogenase or myo-inositol 2- 
dehydrogenase or streptomycin 
biosynthesis protein 


phosphoesterase








stomatin 




DEAD box RNA helicase family 


hypothetical membrane protein




phosphomethylpyrimidine kinase 


mercuric ion-binding protein or 
heavy-metal-associated domain 
containing protein 


ectoine/proline uptake protein


Matched 
length 
(aa) 


r— 

in 
ro 


CO 
U3 


o 

00 
CM 


in 

CO 


o 

CM 


CM 

ro 
ro 


CO 
CO 


1242 








CD 

o 

CM 




1660 






in 

CM 


CD 


Ol 
CM 


Similarity 
(%)


75.5 


58.3 


d 

CD 


55.7 


CM 

oc; 
m 


59.6 


62.4 


62.7








57.3 




80.2 


61.0 




76.8


70.1 


62.3 


Identity 
(%) 


43.0 


31.4 


25.7 


27.2 


25.9 


in 

CD 
CM 


34.1 


33.3 








28.6 






34.8 




50.4


46.3 


29.9 


Homologous gene 


Pseudomonas sp. P51 


Escherichia coli K12 xylE


Salmonella typhimurium iclR


Escherichia coli K12 ydgJ 


Listeria innocua strain 4450 


Sinorhizobium meliloti tdhA


Streptomyces griseus strI


Bacillus subtilis yvnB 








Caenorhabditis elegans unc1




Mycobacterium bovis BCG 
RvD1-Rv2024c


Mycobacterium leprae u2266k 




Bacillus subtilis thiD


Bacillus subtilis yvgY 


Corynebacterium glutamicum 
proP 


db Match 


sp:TCBF_PSESQ 


sp:XYLE_ECOLI


sp:ICLR_SALTY


O 
o 

LU 

O 
>- 

a. 
(/) 


gsp:W61761 


3 
C/5 

GO 

o' 

CM 

id. 

Vi 


sp:STRI_STRGR


pir:C70044 








-J 

LU 
LU 
< 

o 

1 

r- 

o 
z 

D 

d. 

(/) 




CO| 

in 

O 

00 
t— 

O 
CD 

5 

'q. 

O) 


5 

$ 
<n 

(O 

cn 

CO 

CM 

CO 
CM 

t:" 

Q. 




D 
CO 

o 
< 

CD 

o' 

I 

IqL 
t/i 


s 

o 
u. 


< 
in 

O) 
CM 

s 

r 
a 




1089 


Csl 

in 


CD 
GO 


1077 


00 


10051 


1083


40321 


m 

■T 
(O 


<o 

CO 


1086 




Ol 

s 


4929 


o 
in 


o 


o 
o 

CO 


CO 
CM 


CO 
00 


Terminal 
(nt)


3257403


3258661


3261989 


3263221 


3264115 


3265146 


3266266 


3271093 


3267913 


3268618


CM 
(M 

n 


3274488 


3275602 


3276671 


3281666


3283101 


3282347


3283383 


3283473 


Initial 
(nt) 


3258491


3260084 


3261129 


3262145 


3263237 


CM 
^ 

CO 
CM 
CO 


3265184 


3267062 


3268557 


3269235


3271392 


3275231


3276570 


3281599 


3282172


3282742


CD 
^ 

o 

CM 

to 

CM 
CO 


3283141 


3284309 


SEQ 
NO. 
(aa.) 


6881 


6882 

6883 
6884


6885 


6886 


6887 


6888 


6889


6890 
6891


6892 


6893 


6894 


6895 


6896 


6897 


6898 


6899 


SEQ 
NO. 
(DNA) 


3381 


3382


3383


3384 


3385 


3386


3387 


3388


3389 


3390


3391


3392


3393


3394 


3395 


3396 


3397


3398 


3399 
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Function 




o 

Gl 
JZ 

o 
c 
x 

O 

■o 

0) 

o 

!c 


03 

c 
'c 

o 
£ 

I- 

a> 

O X3 

7 E 
2 ro 






hypothetical protein 


hypothetical protein 


partitioning or sporulation protein


glucose inhibited division protein B 


hypothetical membrane protein 


ribonuclease P protein component


50S ribosomal protein L34 






L-aspartate-alpha-decarboxylase
precursor 


2-isopropylmalate synthase


hypothetical protein 


aspartate-semialdehyde 
dehydrogenase 


3-dehydroquinase 


Matched 
length 
(aa) 




0) 


CO 
cn 






CM 
CM 


<D 
CO 


CN 
CN 


CO 

in 


CO 


CO 
CM 








CO 
CO 


CO 


m 

00 


n 




Similarity
(%) 




76.5 


75.4 






58.5 


60.5 


78.0 


64.7 


75.4 


CO 


93.6 






o 
d 
o 


100.0 


o 
d 
o 


100.0 


O 

d 
o 


Identity 
(%) 




42.0 


51.0 






34.4 


37.6 


65.0 


36.0 


44.7 


26.8 


83.0






100.0 


o 
d 
o 


100.0 


100.0 


100.0 


Homologous gene 




Chlamydomonas reinhardtii thi2 


Bacillus subtilis cwlB 






Mycobacterium tuberculosis 
H37Rv Rv3916c


Pseudomonas putida ygi2 


Mycobacterium tuberculosis 
H37Rv pars 


Escherichia coli K12 gidB 


Mycobacterium tuberculosis 
H37Rv Rv3921c


Bacillus subtilis rnpA 


Mycobacterium avium rpmH 






Corynebacterium glutamicum 
panD 


Corynebacterium glutamicum 
ATCC 13032 leuA


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
13032 orfX 


Corynebacterium glutamicum 
asd 


Corynebacterium glutamicum 
AS019 aroD


db Match 




UJ 

a. 
-J 

X 
X 

h- 

id 
(/) 


(/) 
O 
< 

m 

_l 

§ 

Ui 






pir:D70851 


■ Z> 

a 

OJ 

CO 

a. 

cm' 

O 
>• 

CL 

tn 


D 
CL 
LU 
CO 
CL 

> 

id 
tn 


"3 
0 
o 

LU 

s' 

o 

id 
tn 


pir:A70852 


o 
< 

CD 

<' 

a 
z 
a 

id 

UI 


CO 

5> 

r- 

< 

id 

O) 






gp:AF116184_1 


sp:LEU1_CORGL 


sp:YLEU_CORGL


sp:DHAS_CORGL 


■r- 

I 

CO 
CM 

< 

id 


u 


1185 


Ol 
CO 


1242 




1041 


00 
(0 


1152 


CO 
00 


O) 
CO 
CO 


In 
cn 


a> 

O) 

ro 


cn 

CO 


CN 


— 1 

CM 
CM 
CN 


1 

00 

o 


1848 


in 
in 

CM 


1032 




Terminal 
(nt) 


3300119 


3301729 


3302996 


3301989 


3304475 


3302999 


3303636 


3304835 


3305864 


3306682 


3307971 


3308412 


3309321 


3308822 


147573 


266154 


268814 


271691 


446521 


Initial 
(nt) 


3301303 


3301358 


3301755 


3302765 


3303435 


3303616 


3304787 


3305671 


3306532 


3307632 


3308369 


3308747


3309028


3309043 


147980


268001 


269068 


270660 


446075 


SEQ
NO.
(a.a.)


6918 


6919 


6920 


6921 


6922 


6923 


6924 


6925 


6926 


6927


6928 


6929 


6930 


6931 


6932 


6933 


6934 


6935 


6936 


SEQ 
NO. 


3418


3419


3420 


3421 


3422 


3423 


3424 


3425 


3426


3427 


3428


3429


3430 


3431 


3432 


3433 


3434 


3435 


3436 
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Function 


elongation factor Tu 


preprotein translocase secY subunit


isocitrate dehydrogenase 
(oxalosuccinate decarboxylase)


acyl-CoA carboxylase or biotin-
binding protein


citrate synthase 


putative binding protein or peptidyl-
prolyl cis-trans isomerase


glycine betaine transporter


hypothetical membrane protein 


L-lysine permease 


aromatic amino acid permease 


hypothetical protein 


succinyl diaminopimelate 
desuccinylase 


proline transport system 


arginyl-tRNA synthetase


Matched 
length 
(aa) 


CO 
O) 


o 


CO 
CO 


in 


ro 


00 


m 
in 


CO 
CN 
TT 


o 
in 


n 

CO 


CO 
CO 


o 

CO 
CO 


CN 

«n 


0 
in 


>> 






























Similarity
(%) 


100.0


100.0


100.0


100.0


100.0


100.0


100.0


100.0


100.0


100.0


100.0


100.0


100.0


100.0 


Identity 
(%) 


100.0


o 

O 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


o 

d 
o 


100.0 


100.0 


100.0 


100.0 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13059 tuf 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233 
secY 


Corynebacterium glutamicum 
ATCC 13032 icd 


Corynebacterium glutamicum 
ATCC 13032 accBC 


Corynebacterium glutamicum 
ATCC 13032 gltA


Corynebacterium glutamicum 
ATCC 13032 fkbA 


Corynebacterium glutamicum 
ATCC 13032 betP 


Corynebacterium glutamicum 
ATCC 13032 orf2 


Corynebacterium glutamicum 
ATCC 13032 lysI


Corynebacterium glutamicum 
ATCC 13032 aroP 


Corynebacterium glutamicum 
ATCC 13032 orf3 


Corynebacterium glutamicum 
ATCC 13032 dapE 


Corynebacterium glutamicum 
ATCC 13032 putP 


Corynebacterium glutamicum 
AS019 ATCC 13059 argS 


db Match 


sp:EFTU_CORGL 


_] 
O 
oc 
O 

> 
c 

(/i 

CL 

tn 


O 

o 

1 

X 
Q 

bL 
to 


< 

CO 

CN 
CM 
CN 

f 

CL 


.J 
O 

K 
O 
U 

>' 

CO 

O 

ici. 
tn 


sp:FKBP_CORGL 


-J 

O 
OH 

O 
u 

1 

Q. 
H 

UJ 
QQ 
'id. 
</] 


q: 
o 
o 

cn' 
-J 

>- 

d. 

tA 


-J 
(D 
CC 
O 

o 

1 

> 

-J 

Q. 

tn 


sp:AROP_CORGL


ro 
in 
r>. 

CN 

m 
CO 
l; 
a. 


prf:2106301A


gp:CGPUTP_1 


_j 

QC 
0 
0 

a' 

>- 

CO 
cL 
<n 




1188 


1320 


2214 


1773 


1311 


in 
ro 


1785 


1278 


1503 


1389 


00 
CD 


1107 


1572 


1650 


Terminal 
(nt) 


527563 


570771 


677831 


718580 


879148 


879629 


946780 


1029006 


1030369 


1153295 


1154729 


1156837 


1218031 


1239923 


Initial 
(nt) 


526376


569452 


680044 


720352 


877838 


879276 


944996 


1030283 


1031871 


1154683 


1155676 


1155731 


1219602 


CN 

00 

CO 
CN 


SEQ
NO.
(a.a.)


6937


6938 


6939 


6940 


6941 


6942 


6943 


6944 


6945 


6946


6947 


6948 


6949


1 

6950 


SEQ 
NO 


3437 


3438 


3439 


3440 


3441 


3442 


3443 


3444 


3445


3446


3447


3448


3449


3450 
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Function 


NADH dehydrogenase 


phosphoribosyl-ATP-pyrophosphohydrolase


0) 

5^ 

o 

^ 

T3 
O 
O 
>« 

c 
c 


ammonium uptake protein, high 
affinity 


protein-export membrane protein
secG 


phosphoenolpyruvate carboxylase 


chorismate synthase (5- 

enolpyruvylshikimate-3-phosphate 

phospholyase) 


restriction endonuclease 


Sigma factor or RNA polymerase 
transcription factor 


glutamate-binding protein 


recA protein 


dihydrodipicolinate synthase 


dihydrodipicolinate reductase 


L-malate dehydrogenase (acceptor)




Matched 
length 
(a.a) 


(D 
rr 


GO 


CN 
CD 
CO 


CN 
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CO 
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nt 
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cum 
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fientu 


£ 
o 

U 


)ntlni 


Homologous gen( 


utami 


utami 


utam 


E 
3 


utami 


utami 


utami 


utami 


utam 


utam 


utam! 


utami 
tofern 


ulami 
lofem 


utami 


30 y 
0) 

35 


Corynebaderium gl 
ATCC 13032 ndh 


Corynebacterium gl 
ASOIQhisE 


Corynebacterium gl 
ATCC 13032 ocd 


Corynebacterium gl 
ATCC 13032 amt 


Corynebacterium gl 
ATCC 13032 secG 


Corynebacterium gl 
ATCC 13032 ppc 


Corynebacterium gl 
AS0l9aroC 


Corynebacterium gl 
ATCC 13032 cglllR 


Corynebacterium gi 
ATCC 13869 sigB 


Corynebacterium gl 
ATCC 13032 gluB 


Corynebacterium gl 
AS0l9recA 


Corynebacterium gl 
(Brevibacterium lac 
ATCC 13869 dapA 


Corynebacterium gl 
(Brevibacterium lac 
ATCC 13869 dapB 


Corynebacterium gl 
R127 mqo 


40 


db Match 


CGL238250J 


o 

cx> 
o 
u. 
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CO 
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IT 

I 
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-J 
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o 

<' 
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UJ 
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UJ 

CD 
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DAPB_CORGL 
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U) 
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cn 


d. 
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O) 


nrf 


b. 
Q) 


□ir" 
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CL 


CL 


(J) 


d. 
(/) 
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Ol 
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O) 
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o 


T— 

CO 
CN 


1086 


1356 


CO 
CN 


2757 
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OJ 
r- 


1896 


CO 
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Oi 


m 

00 
00 


1128 


o 


h- 


1500 


45 


Terminal 
(nt) 

t 


ro 
in 


1586466 


1674123 


1675268 


1677049 


1677387 


1719669 


1882385 


2021846 


2061504 


2063989 


2079281 


2081191 


2113864 


50 


1 

Initial 
(nt) 


1544554 


1586725 


1675208 


1676623 


1677279 


1680143 


1720898 


o 
cn 

o 

CO 

00 


2020854 


2060620 


2065116 


2080183 


2081934 


2115363 




SEQ 
NO 

(a.a.) 


6965 


6966 


6967 


6968 


6969 


6970 


6971 


6972 


6973 


i 1^ 
o 

1 ^ 


6976 


6976 


6977 


6978 


55 


SEQ 
NO. 
(DNA) 
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Function 


uridilylyltransferase, uridilylyl- 
removing enzyme 


nitrogen regulatory protein P-ll 


ammonium transporter 


glutamate dehydrogenase (NADl 


pyruvate kinase 


glucokinase 


giutamine synthetase 


threonine synthase 


ectoine/proline/glyclne betaine 
carrier 


malate synthase 


isocitrate lyase 


glutamate 5-kinase 


cystathionine gamma-synthase 


ribonucleotide reductase 


glutaredoxin 


Matched 
length i 
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Example 2 

Determination of effective mutation site 

(1) Identification of mutation site based on the comparison of the gene nucleotide sequence of lysine-producing B-6
strain with that of wild type strain ATCC 13032

[0374] Corynebacterium glutamicum B-6, which is resistant to S-(2-aminoethyl)cysteine (AEC), rifampicin, streptomycin
and 6-azauracil, is a lysine-producing mutant that was bred by subjecting the wild type ATCC 13032 strain to multiple
rounds of random mutagenesis with the mutagen N-methyl-N'-nitro-N-nitrosoguanidine (NTG) and screening
(Appl. Microbiol. Biotechnol., 32: 269-273 (1989)). First, the nucleotide sequences of genes derived from the B-6 strain
and considered to relate to lysine production were determined by a method similar to the above. The genes relating to
lysine production include lysE and lysG, which are lysine-excreting genes; ddh, dapA, hom and lysC (encoding
diaminopimelate dehydrogenase, dihydrodipicolinate synthase, homoserine dehydrogenase and aspartokinase,
respectively), which are lysine-biosynthetic genes; and pyc and zwf (encoding pyruvate carboxylase and
glucose-6-phosphate dehydrogenase, respectively), which are glucose-metabolizing genes. The nucleotide sequences of
the genes derived from the production strain were compared with the corresponding nucleotide sequences of the
ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and analyzed. As a result, mutation points were
observed in many of the genes: no mutation site was observed in lysE, lysG, ddh, dapA, and the like, whereas amino
acid replacement mutations were found in hom, lysC, pyc, zwf, and the like. Among these mutation points, those
considered to contribute to production were extracted on the basis of known biochemical or genetic information. Among
the mutation points thus extracted, a mutation, Val59Ala, in hom and a mutation, Pro458Ser, in pyc were evaluated for
effectiveness according to the following method.
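
The comparison step described above, in which a production-strain gene is aligned against the wild type gene and amino acid replacements such as Val59Ala are reported, can be sketched roughly as follows. This is only a minimal illustration: the function names and the toy sequences are hypothetical, it assumes both coding sequences have the same length and reading frame, and it does not represent the method or software actually used in the source.

```python
# Illustrative sketch only; helper names and sequences are hypothetical.
BASES = "TCAG"
# Standard genetic code, listed in T, C, A, G order for each codon position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {}
for i, b1 in enumerate(BASES):
    for j, b2 in enumerate(BASES):
        for k, b3 in enumerate(BASES):
            CODON_TABLE[b1 + b2 + b3] = AMINO_ACIDS[16 * i + 4 * j + k]

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(cds):
    """Translate a coding sequence codon by codon (a trailing partial codon is ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def amino_acid_replacements(wild_cds, mutant_cds):
    """Return labels such as 'Val59Ala' for positions where the encoded residue differs."""
    if len(wild_cds) != len(mutant_cds):
        raise ValueError("this sketch handles substitutions only, not insertions or deletions")
    wild, mutant = translate(wild_cds), translate(mutant_cds)
    return [
        f"{THREE_LETTER[w]}{pos}{THREE_LETTER[m]}"
        for pos, (w, m) in enumerate(zip(wild, mutant), start=1)
        if w != m
    ]


# Toy sequences (not the real hom gene): codon 2 changes GTT -> GCT (Val -> Ala),
# codon 3 changes CCA -> CCG, which is silent (Pro -> Pro).
wild_type = "ATGGTTCCA"
production_strain = "ATGGCTCCG"
print(amino_acid_replacements(wild_type, production_strain))
```

Silent nucleotide changes are deliberately not reported, which mirrors the distinction drawn above between mutation points in general and amino acid replacement mutations.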

(2) Evaluation of mutation, Val59Ala, in hom and mutation, Pro458Ser, in pyc

[0375] It is known that a mutation in hom inducing requirement or partial requirement for homoserine imparts lysine
productivity to a wild type strain (Amino Acid Fermentation, ed. by Hiroshi Aida et al., Japan Scientific Societies Press).
However, the relationship between the mutation, Val59Ala, in hom and lysine production is not known. Whether or not
the mutation, Val59Ala, in hom is an effective mutation can be examined by introducing the mutation into the wild type
strain and examining the lysine productivity of the resulting strain. On the other hand, whether or not the mutation,
Pro458Ser, in pyc is effective can be examined by introducing this mutation into a lysine-producing strain which has
a deregulated lysine-biosynthetic pathway and is free from the pyc mutation, and comparing the lysine productivity of
the resulting strain with that of the parent strain. As such a lysine-producing bacterium, No. 58 strain (FERM BP-7134)
was selected (hereinafter referred to as the "lysine-producing No. 58 strain" or the "No. 58 strain"). Based on the above,
it was determined that the mutation, Val59Ala, in hom and the mutation, Pro458Ser, in pyc would be introduced into the
wild type strain of Corynebacterium glutamicum ATCC 13032 (hereinafter referred to as the "wild type ATCC 13032
strain" or the "ATCC 13032 strain") and the lysine-producing No. 58 strain, respectively, using the gene replacement
method. A plasmid vector pCES30 for the gene replacement was constructed by the following method.

[0376] A plasmid vector pCE53 having a kanamycin-resistant gene and being capable of autonomously replicating
in coryneform bacteria (Mol. Gen. Genet., 196: 175-178 (1984)) and a plasmid pMOB3 (ATCC 77282) containing a
levansucrase gene (sacB) of Bacillus subtilis (Molecular Microbiology, 6: 1195-1204 (1992)) were each digested with
PstI. Then, after agarose gel electrophoresis, a pCE53 fragment and a 2.6 kb DNA fragment containing sacB were
each extracted and purified using GENECLEAN Kit (manufactured by BIO 101). The pCE53 fragment and the 2.6 kb
DNA fragment were ligated using Ligation Kit ver. 2 (manufactured by Takara Shuzo), introduced into the ATCC 13032
strain by the electroporation method (FEMS Microbiology Letters, 65: 299 (1989)), and cultured on BYG agar medium
(medium prepared by adding 10 g of glucose, 20 g of peptone (manufactured by Kyokuto Pharmaceutical), 5 g of yeast
extract (manufactured by Difco), and 16 g of Bactoagar (manufactured by Difco) to 1 liter of water, and adjusting its
pH to 7.2) containing 25 µg/ml kanamycin at 30°C for 2 days to obtain a transformant acquiring kanamycin resistance.
As a result of digestion analysis with restriction enzymes, it was confirmed that a plasmid extracted from the resulting
transformant by the alkali SDS method had a structure in which the 2.6 kb DNA fragment had been inserted into the
PstI site of pCE53. This plasmid was named pCES30.

[0377] Next, two genes having a mutation point, hom and pyc, were amplified by PCR, and inserted into pCES30
according to the TA cloning method (Bio Experiment Illustrated, vol. 3, published by Shujunsha). Specifically, pCES30
was digested with BamHI (manufactured by Takara Shuzo), subjected to agarose gel electrophoresis, and extracted
and purified using GENECLEAN Kit (manufactured by BIO 101). Both ends of the resulting pCES30 fragment were
blunted with DNA Blunting Kit (manufactured by Takara Shuzo) according to the attached protocol. The blunt-ended
pCES30 fragment was concentrated by extraction with phenol/chloroform and precipitation with ethanol, and allowed
to react in the presence of Taq polymerase (manufactured by Roche Diagnostics) and dTTP at 70°C for 2 hours so
that a nucleotide, thymine (T), was added to the 3'-end to prepare a T vector of pCES30.

[0378] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the method
of Saito et al. (Biochem. Biophys. Acta, 72: 619 (1963)). Using the chromosomal DNA as a template, PCR was carried
out with Pfu turbo DNA polymerase (manufactured by Stratagene). For the mutated hom gene, the DNAs having the
nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were used as the primer set. For the mutated pyc
gene, the DNAs having the nucleotide sequences represented by SEQ ID NOS:7004 and 7005 were used as the primer
set. The resulting PCR product was subjected to agarose gel electrophoresis, and extracted and purified using
GENECLEAN Kit (manufactured by BIO 101). Then, the PCR product was allowed to react in the presence of Taq
polymerase (manufactured by Roche Diagnostics) and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A),
was added to the 3'-end.

[0379] The above pCES30 T vector fragment and the PCR product of the mutated hom gene (1.7 kb) or mutated pyc
gene (3.6 kb) to which the nucleotide A had been added were concentrated by extraction with phenol/chloroform
and precipitation with ethanol, and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the culture
medium according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.7 kb or 3.6 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pChom59 and pCpyc458, respectively.

[0380] The introduction of the mutations into the wild type ATCC 13032 strain and the lysine-producing No. 58 strain
according to the gene replacement method was carried out as follows. Specifically, pChom59 and pCpyc458 were
introduced into the ATCC 13032 strain and the No. 58 strain, respectively, and strains in which the plasmid was
integrated into the chromosomal DNA by homologous recombination were selected using the method of Ikeda et al.
(Microbiology, 144: 1863 (1998)). Then, the strains in which the second homologous recombination had occurred were
selected by a selection method making use of the fact that the Bacillus subtilis levansucrase encoded by pCES30
produces a suicidal substance (J. of Bacteriol., 174: 5462 (1992)). Among the selected strains, strains in which the wild
type hom and pyc genes possessed by the ATCC 13032 strain and the No. 58 strain were replaced with the mutated
hom and pyc genes, respectively, were isolated. The method is specifically explained below.

[0381] One strain was selected from the transformants containing the plasmid, pChom59 or pCpyc458, and the
selected strain was cultured in BYG medium containing 20 µg/ml kanamycin, and pCG11 (Japanese Published
Examined Patent Application No. 91827/94) was introduced thereinto by the electroporation method. pCG11 is a plasmid
vector having a spectinomycin-resistant gene and a replication origin which is the same as that of pCE53. After
introduction of pCG11, the strain was cultured on BYG agar medium containing 20 µg/ml kanamycin and 100 µg/ml
spectinomycin at 30°C for 2 days to obtain a transformant resistant to both kanamycin and spectinomycin. The
chromosome of one strain of these transformants was examined by Southern blotting hybridization according to the
method reported by Ikeda et al. (Microbiology, 144: 1863 (1998)). As a result, it was confirmed that pChom59 or pCpyc458
had been integrated into the chromosome by homologous recombination of the Campbell type. In such a strain, the wild
type and mutated hom or pyc genes are present closely on the chromosome, and the second homologous recombination
is liable to arise therebetween.

[0382] Each of these transformants (having been recombined once) was spread on Suc agar medium (medium
prepared by adding 100 g of sucrose, 7 g of meat extract, 10 g of peptone, 3 g of sodium chloride, 5 g of yeast extract
(manufactured by Difco), and 18 g of Bactoagar (manufactured by Difco) to 1 liter of water, and adjusting its pH to 7.2)
and cultured at 30°C for a day. Then the colonies thus growing were selected in each case. Since a strain in which the
sacB gene is present converts sucrose into a suicide substrate, it cannot grow in this medium (J. Bacteriol., 174: 5462
(1992)). On the other hand, a strain in which the sacB gene was deleted due to the second homologous recombination
between the wild type and the mutated hom or pyc genes positioned closely to each other forms no suicide substrate
and, therefore, can grow in this medium. In the homologous recombination, either the wild type gene or the mutated
gene is deleted together with the sacB gene. When the wild type gene is deleted together with the sacB gene,
replacement with the mutated gene arises.
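
The logic of the sucrose counter-selection described above can be summarized with a small toy model. It is a deliberately simplified sketch under stated assumptions: the `Strain` class and function names are hypothetical, recombination is treated as a simple choice of which allele is retained, and nothing here models real recombination frequencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    """Hypothetical, simplified description of a strain at the target locus."""
    alleles: tuple   # copies of the target gene on the chromosome
    has_sacB: bool   # True while the pCES30-derived plasmid is integrated


def grows_on_sucrose(strain):
    # sacB (levansucrase) converts sucrose into a lethal product,
    # so only cells that have lost sacB form colonies on Suc agar.
    return not strain.has_sacB


def second_recombination(integrant):
    """Enumerate the two possible outcomes of excising the integrated plasmid:
    one allele is retained, the other is lost together with sacB."""
    return [Strain(alleles=(kept,), has_sacB=False) for kept in integrant.alleles]


# After the first (Campbell-type) integration both alleles and sacB are present.
integrant = Strain(alleles=("wild", "mutant"), has_sacB=True)

candidates = [integrant] + second_recombination(integrant)
survivors = [s for s in candidates if grows_on_sucrose(s)]
for s in survivors:
    print(f"{s.alleles[0]} allele retained; genotype must be confirmed by sequencing")
```

Because both excision outcomes survive on sucrose, the selection alone cannot tell wild type revertants from gene replacement strains, which is why the source goes on to sequence the PCR product from each second recombinant.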

[0383] Chromosomal DNA of each of the thus obtained second recombinants was prepared by the above method of
Saito et al. PCR was carried out using Pfu turbo DNA polymerase (manufactured by Stratagene) and the attached
buffer. For the hom gene, DNAs having the nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were
used as the primer set. For the pyc gene, DNAs having the nucleotide sequences represented by SEQ ID NOS:7004
and 7005 were used as the primer set. The nucleotide sequences of the PCR products were determined by the
conventional method to judge whether the hom or pyc gene of the second recombinant was of the wild type or the
mutant type. As a result, the second recombinants named HD-1 and No. 58pyc were identified as the target strains
having the mutated hom gene and the mutated pyc gene, respectively.

(3) Lysine production test of HD-1 and No. 58pyc strains

[0384] The HD-1 strain (strain obtained by incorporating the mutation, Val59Ala, in the hom gene into the ATCC
13032 strain) and the No. 58pyc strain (strain obtained by incorporating the mutation, Pro458Ser, in the pyc gene into
the lysine-producing No. 58 strain) were subjected to a culture test in a 5 l jar fermenter, using the ATCC 13032
strain and the lysine-producing No. 58 strain, respectively, as controls. Lysine production was thus examined.
[0385] After culturing on BYG agar medium at 30°C for 24 hours, each strain was inoculated into 250 ml of a seed
medium (medium prepared by adding 50 g of sucrose, 40 g of corn steep liquor, 8.3 g of ammonium sulfate, 1 g of
urea, 2 g of potassium dihydrogenphosphate, 0.83 g of magnesium sulfate heptahydrate, 10 mg of iron sulfate
heptahydrate, 1 mg of copper sulfate pentahydrate, 10 mg of zinc sulfate heptahydrate, 10 mg of β-alanine, 5 mg of
nicotinic acid, 1.5 mg of thiamin hydrochloride, and 0.5 mg of biotin to 1 liter of water, adjusting its pH to 7.2, and then
adding 30 g of calcium carbonate) contained in a 2 l baffled Erlenmeyer flask and cultured therein at 30°C for 12 to
16 hours. The total amount of the seed culture was inoculated into 1,400 ml of a main culture medium (medium
prepared by adding 60 g of glucose, 20 g of corn steep liquor, 25 g of ammonium chloride, 2.5 g of potassium
dihydrogenphosphate, 0.75 g of magnesium sulfate heptahydrate, 50 mg of iron sulfate heptahydrate, 13 mg of
manganese sulfate pentahydrate, 50 mg of calcium chloride, 6.3 mg of copper sulfate pentahydrate, 1.3 mg of zinc
sulfate heptahydrate, 5 mg of nickel chloride hexahydrate, 1.3 mg of cobalt chloride hexahydrate, 1.3 mg of ammonium
molybdate tetrahydrate, 14 mg of nicotinic acid, 23 mg of β-alanine, 7 mg of thiamin hydrochloride, and 0.42 mg of
biotin to 1 liter of water) contained in a 5 l jar fermenter and cultured therein at 32°C, 1 vvm and 800 rpm while
controlling the pH to 7.0 with aqueous ammonia. When glucose in the medium had been consumed, a glucose feeding
solution (prepared by adding 400 g of glucose and 45 g of ammonium chloride to 1 liter of water) was continuously
added. The feeding solution was added at a controlled rate so as to maintain the dissolved oxygen concentration within
a range of 0.5 to 3 ppm. After culturing for 29 hours, the culture was terminated. The cells were separated from the
culture medium by centrifugation and then L-lysine hydrochloride in the supernatant was quantified by high performance
liquid chromatography (HPLC). The results are shown in Table 2 below.



Table 2

Strain        L-Lysine hydrochloride yield (g/l)
ATCC 13032     0
HD-1           8
No. 58        45
No. 58pyc     51



[0386] As is apparent from the results shown in Table 2, the lysine productivity was improved by introducing the
mutation, Val59Ala, in the hom gene or the mutation, Pro458Ser, in the pyc gene. Accordingly, it was found that
both mutations are effective mutations relating to the production of lysine. Strain AHP-3, in which the mutation,
Val59Ala, in the hom gene and the mutation, Pro458Ser, in the pyc gene have been introduced into the wild type ATCC
13032 strain together with the mutation, Thr311Ile, in the lysC gene, was deposited on December 5, 2000, in
National Institute of Bioscience and Human Technology, Agency of Industrial Science and Technology (Higashi 1-1-3,
Tsukuba-shi, Ibaraki, Japan) as FERM BP-7382.

Example 3 

Reconstruction of lysine-producing strain based on genome information

[0387] The lysine-producing mutant B-6 strain (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)), which was
constructed from the wild type ATCC 13032 strain by multiple rounds of random mutagenesis with NTG and screening,
produces a remarkably large amount of lysine hydrochloride when cultured in a jar at 32°C using glucose as a carbon
source. However, since the fermentation period is long, the production rate is less than 2.1 g/l/h. Breeding was
therefore performed to reconstitute, in the wild type ATCC 13032 strain, only the effective mutations relating to the
production of lysine among the estimated at least 300 mutations introduced into the B-6 strain.

(1) Identification of mutation point and effective mutation by comparing the gene nucleotide sequence of the B-6 strain
with that of the ATCC 13032 strain

[0388] As described above, the nucleotide sequences of genes derived from the B-6 strain were compared with the
corresponding nucleotide sequences of the ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and
analyzed to identify many mutation points accumulated in the chromosome of the B-6 strain. Among these, a mutation,
Val59Ala, in hom, a mutation, Thr311Ile, in lysC, a mutation, Pro458Ser, in pyc and a mutation, Ala213Thr, in zwf
were specified as effective mutations relating to the production of lysine. Breeding to reconstitute the 4 mutations in
the wild type strain and thereby construct an industrially important lysine-producing strain was carried out according
to the method shown below.

(2) Construction of plasmid for gene replacement having mutated gene 

[0389] The plasmid for gene replacement, pChom59, having the mutated hom gene and the plasmid for gene
replacement, pCpyc458, having the mutated pyc gene were prepared in the above Example 2(2). Plasmids for gene
replacement having the mutated lysC and zwf were produced as described below.

[0390] The lysC and zwf having mutation points were amplified by PCR, and inserted into a plasmid for gene
replacement, pCES30, according to the TA cloning method described in Example 2(2) (Bio Experiment Illustrated, vol. 3).
[0391] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the above
method of Saito et al. Using the chromosomal DNA as a template, PCR was carried out with Pfu turbo DNA polymerase
(manufactured by Stratagene). For the mutated lysC gene, the DNAs having the nucleotide sequences represented by
SEQ ID NOS:7006 and 7007 were used as the primer set. For the mutated zwf gene, the DNAs having the nucleotide
sequences represented by SEQ ID NOS:7008 and 7009 were used as the primer set. The resulting PCR product was
subjected to agarose gel electrophoresis, and extracted and purified using GENECLEAN Kit (manufactured by BIO 101).
Then, the PCR product was allowed to react in the presence of Taq DNA polymerase (manufactured by Roche
Diagnostics) and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A), was added to the 3'-end.
[0392] The above pCES30 T vector fragment and the PCR product of the mutated lysC gene (1.5 kb) or mutated zwf
gene (2.3 kb) to which the nucleotide A had been added were concentrated by extraction with phenol/chloroform
and precipitation with ethanol, and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the culture
medium according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.5 kb or 2.3 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pClysC311 and pCzwf213, respectively.

(3) Introduction of mutation, Thr311Ile, in lysC into one point mutant HD-1

[0393] Since the one point mutant HD-1, in which the mutation, Val59Ala, in hom was introduced into the
wild type ATCC 13032 strain, had been obtained in Example 2(2), the mutation, Thr311Ile, in lysC was introduced into
the HD-1 strain using pClysC311 produced in the above (2) according to the gene replacement method described in
Example 2(2). PCR was carried out using chromosomal DNA of the resulting strain and, as the primer set, DNAs having
the nucleotide sequences represented by SEQ ID NOS:7006 and 7007 in the same manner as in Example 2(2). The
nucleotide sequence of the PCR product was determined in the usual manner, and it was confirmed that the strain,
which was named AHD-2, was a two point mutant having the mutated lysC gene in addition to the mutated hom gene.

(4) Introduction of mutation, Pro458Ser, in pyc into two point mutant AHD-2 

[0394] The mutation, Pro458Ser, in pyc was introduced into the AHD-2 strain using the pCpyc458 produced in
Example 2(2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal
DNA of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID
NOS:7004 and 7005 in the same manner as in Example 2(2). The nucleotide sequence of the PCR product was
determined in the usual manner, and it was confirmed that the strain, which was named AHP-3, was a three point
mutant having the mutated pyc gene in addition to the mutated hom gene and lysC gene.

(5) Introduction of mutation, Ala213Thr, in zwf into three point mutant AHP-3

[0395] The mutation, Ala213Thr, in zwf was introduced into the AHP-3 strain using the pCzwf213 produced in the
above (2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal DNA
of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID NOS:
7008 and 7009 in the same manner as in Example 2(2). The nucleotide sequence of the PCR product was determined
in the usual manner, and it was confirmed that the strain, which was named APZ-4, was a four point mutant having
the mutated zwf gene in addition to the mutated hom gene, lysC gene and pyc gene.
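
The stepwise reconstruction in sections (3) to (5) amounts to accumulating one mutation per strain on top of its parent. A minimal sketch of that bookkeeping is given below; the data structure and function name are hypothetical, and the lysC mutation is written as Thr311Ile to match the plasmid name pClysC311.

```python
# Each step: (new strain, parent strain, mutation introduced by gene replacement).
# Illustrative bookkeeping only; the layout is an assumption, not part of the source.
STEPS = [
    ("HD-1", "ATCC 13032", "hom Val59Ala"),
    ("AHD-2", "HD-1", "lysC Thr311Ile"),
    ("AHP-3", "AHD-2", "pyc Pro458Ser"),
    ("APZ-4", "AHP-3", "zwf Ala213Thr"),
]


def genotypes(steps, wild_type="ATCC 13032"):
    """Return the cumulative list of introduced mutations for every strain."""
    result = {wild_type: []}
    for strain, parent, mutation in steps:
        # A strain carries everything its parent carries plus the new mutation.
        result[strain] = result[parent] + [mutation]
    return result


for strain, mutations in genotypes(STEPS).items():
    print(f"{strain}: {len(mutations)} point mutation(s): {', '.join(mutations) or 'none'}")
```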

(6) Lysine production test on HD-1, AHD-2, AHP-3 and APZ-4 strains

[0396] The HD-1, AHD-2, AHP-3 and APZ-4 strains obtained above were subjected to a culture test in a 5 l jar
fermenter in accordance with the method of Example 2(3).
[0397] Table 3 shows the results.

Table 3

Strain    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
HD-1       8                              0.3
AHD-2     73                              2.5
AHP-3     80                              2.8
APZ-4     86                              3.0



[0398] Since the lysine-producing mutant B-6 strain, which has been bred based on random mutation and selection,
shows a productivity of less than 2.1 g/l/h, the APZ-4 strain showing a high productivity of 3.0 g/l/h is useful in industry.
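
The productivity figures in Table 3 are consistent with dividing the final titer by the culture time. The source does not state the formula explicitly, so the sketch below rests on two assumptions: that productivity is simply titer divided by time, and that the 29-hour culture time given in Example 2(3) applies to these runs as well.

```python
# Assumption: the 29 h culture time from Example 2(3) applies to the Table 3 runs.
CULTURE_HOURS = 29

# Titers (g/l of L-lysine hydrochloride) as listed in Table 3.
TABLE3_TITERS = {"HD-1": 8, "AHD-2": 73, "AHP-3": 80, "APZ-4": 86}


def productivity(titer_g_per_l, hours=CULTURE_HOURS):
    """Volumetric productivity in g/l/h, assuming titer divided by culture time."""
    return titer_g_per_l / hours


for strain, titer in TABLE3_TITERS.items():
    print(f"{strain}: {productivity(titer):.1f} g/l/h")
```

Under these assumptions the rounded values reproduce the Productivity column of Table 3.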

(7) Lysine fermentation by APZ-4 strain at high temperature

[0399] The APZ-4 strain, which had been reconstructed by introducing 4 effective mutations into the wild type strain,
was subjected to the culturing test in a 5 l jar fermenter in the same manner as in Example 2(3), except that the culturing
temperature was changed to 40°C.
[0400] The results are shown in Table 4.



Table 4

Temperature (°C)    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
32                  86                              3.0
40                  95                              3.3



[0401] As is apparent from the results shown in Table 4, the lysine hydrochloride titer and productivity obtained in
culturing at a high temperature of 40°C were comparable to those at 32°C. In the mutated and bred lysine-producing
B-6 strain constructed by repeating random mutation and selection, the growth and the lysine productivity are lowered
at temperatures exceeding 34°C so that lysine fermentation cannot be carried out, whereas lysine fermentation can
be carried out using the APZ-4 strain at a high temperature of 40°C, so that the load of cooling is greatly reduced,
which is industrially useful. The lysine fermentation at high temperatures can be achieved because the high temperature
adaptability inherently possessed by the wild type strain is retained in the APZ-4 strain.

[0402] As demonstrated in the reconstruction of the lysine-producing strain, the present invention provides a novel
breeding method effective for eliminating the problems in the conventional mutants and acquiring industrially
advantageous strains. This methodology, which reconstitutes the production strain by reconstituting the effective
mutations, is an approach which is efficiently carried out using the nucleotide sequence information of the genome
disclosed in the present invention, and its effectiveness was found for the first time in the present invention.

Example 4 

Production of DNA microarray and use thereof 

[0403] A DNA microarray was produced based on the nucleotide sequence information of the ORFs deduced, using
software, from the full nucleotide sequence of Corynebacterium glutamicum ATCC 13032, and genes whose
expression fluctuates depending on the carbon source during culturing were searched for.

(1) Production of DNA microarray

[0404] Chromosomal DNA was prepared from Corynebacterium glutamicum ATCC 13032 by the method of Saito et
al. (Biochem. Biophys. Acta, 72: 619 (1963)). Based on 24 genes having the nucleotide sequences represented by
SEQ ID NOS:207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453, 3455, 1743, 3470, 2132, 3476,
3477, 3485, 3488, 3489, 3494, 3496, and 3497 from the ORFs shown in Table 1 deduced from the full genome
nucleotide sequence of Corynebacterium glutamicum ATCC 13032 using software and the nucleotide sequence of rabbit
globin gene (GenBank Accession No. V00882) used as an internal standard, oligo DNA primers for PCR amplification
represented by SEQ ID NOS:7010 to 7059 targeting the nucleotide sequences of the genes were synthesized in a
usual manner.
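
The assignment of primer pairs to targets follows a regular pattern: the targets are taken in the order listed and each receives the next two consecutive primer SEQ ID numbers, starting at 7010. The sketch below reproduces that pattern. The function name is hypothetical, and placing the rabbit globin gene last (so that it receives SEQ ID NOS:7058 and 7059) is an assumption inferred from the stated primer range of 7010 to 7059, not something this passage states directly.

```python
# Target SEQ ID NOs in the order given in paragraph [0404].
TARGET_SEQ_IDS = [
    207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453,
    3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488, 3489, 3494, 3496, 3497,
]
# Assumption: the internal standard is amplified with the last primer pair.
INTERNAL_STANDARD = "rabbit globin (V00882)"
FIRST_PRIMER_SEQ_ID = 7010


def primer_pairs(targets, first_primer=FIRST_PRIMER_SEQ_ID):
    """Map each target to two consecutive primer SEQ ID NOs, in list order."""
    return {
        target: (first_primer + 2 * i, first_primer + 2 * i + 1)
        for i, target in enumerate(targets)
    }


pairs = primer_pairs(TARGET_SEQ_IDS + [INTERNAL_STANDARD])
for target, (forward, reverse) in pairs.items():
    print(f"SEQ ID NOS:{forward} and {reverse} -> {target}")
```

With 24 genes plus one internal standard, this gives 25 pairs and exactly exhausts the 50 primers numbered 7010 to 7059.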

[0405] As the oligo DNA primers used for the PCR,

[0406] DNAs having the nucleotide sequence represented by SEQ ID NOS:7010 and 7011 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:207,

[0407] DNAs having the nucleotide sequence represented by SEQ ID NOS:7012 and 7013 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3433,

[0408] DNAs having the nucleotide sequence represented by SEQ ID NOS:7014 and 7015 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:281,

[0409] DNAs having the nucleotide sequence represented by SEQ ID NOS:7016 and 7017 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3435,

[0410] DNAs having the nucleotide sequence represented by SEQ ID NOS:7018 and 7019 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3439,

[0411] DNAs having the nucleotide sequence represented by SEQ ID NOS:7020 and 7021 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:765,

[0412] DNAs having the nucleotide sequence represented by SEQ ID NOS:7022 and 7023 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3445,

[0413] DNAs having the nucleotide sequence represented by SEQ ID NOS:7024 and 7025 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1226,

[0414] DNAs having the nucleotide sequence represented by SEQ ID NOS:7026 and 7027 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1229,

[0415] DNAs having the nucleotide sequence represented by SEQ ID NOS:7028 and 7029 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3448,

[0416] DNAs having the nucleotide sequence represented by SEQ ID NOS:7030 and 7031 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3451,

[0417] DNAs having the nucleotide sequence represented by SEQ ID NOS:7032 and 7033 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3453,

[0418] DNAs having the nucleotide sequence represented by SEQ ID NOS:7034 and 7035 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3455,

[0419] DNAs having the nucleotide sequence represented by SEQ ID NOS:7036 and 7037 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1743,

[0420] DNAs having the nucleotide sequence represented by SEQ ID NOS:7038 and 7039 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3470,

[0421] DNAs having the nucleotide sequence represented by SEQ ID NOS:7040 and 7041 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:2132,

[0422] DNAs having the nucleotide sequence represented by SEQ ID NOS:7042 and 7043 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3476,

[0423] DNAs having the nucleotide sequence represented by SEQ ID NOS:7044 and 7045 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3477,

[0424] DNAs having the nucleotide sequence represented by SEQ ID NOS:7046 and 7047 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3485,

[0425] DNAs having the nucleotide sequence represented by SEQ ID NOS:7048 and 7049 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3488,

[0426] DNAs having the nucleotide sequence represented by SEQ ID NOS:7050 and 7051 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3489,

[0427] DNAs having the nucleotide sequence represented by SEQ ID NOS:7052 and 7053 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3494,

[0428] DNAs having the nucleotide sequence represented by SEQ ID NOS:7054 and 7055 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3496, 
55 [0429] DNAs having the nucleotide sequence represented by SEQ ID NOS:7056 and 7057 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3497, and 
[0430] DNAs having the nucleotide sequence represented by SEQ ID NOS:7058 and 7059 were used for the am- 
plification of the DNA having the nucleotide sequence of the rabbit giobin gene. 
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as the respective primer set. 

[0431] The PCR was carried out for 30 cycles, each cycle consisting of 15 seconds at 95°C and 3 minutes at 68°C, using a thermal cycler (GeneAmp PCR System 9600, manufactured by Perkin Elmer), TaKaRa Ex-Taq (manufactured by Takara Shuzo), 100 ng of the chromosomal DNA and the buffer attached to the TaKaRa Ex-Taq reagent. In the case of the rabbit globin gene, a single-stranded cDNA which had been synthesized from rabbit globin mRNA (manufactured by Life Technologies) according to the manufacturer's instructions using a reverse transcriptase RAV-2 (manufactured by Takara Shuzo) was used as the template. The PCR product of each gene thus amplified was subjected to agarose gel electrophoresis and extracted and purified using QIAquick Gel Extraction Kit (manufactured by QIAGEN). The purified PCR product was concentrated by precipitating it with ethanol and adjusted to a concentration of 200 ng/µl. Each PCR product was spotted on a slide glass plate (manufactured by Matsunami Glass) having MAS coating in 2 runs using GTMASS SYSTEM (manufactured by Nippon Laser & Electronics Lab.) according to the manufacturer's instructions.

(2) Synthesis of fluorescence labeled cDNA 

[0432] The ATCC 13032 strain was spread on BY agar medium (medium prepared by adding 20 g of peptone (manufactured by Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured by Difco), and 16 g of Bactoagar (manufactured by Difco) to 1 liter of water and adjusting its pH to 7.2) and cultured at 30°C for 2 days. Then, the cultured strain was further inoculated into 5 ml of BY liquid medium and cultured at 30°C overnight. Then, the cultured strain was further inoculated into 30 ml of a minimum medium (medium prepared by adding 5 g of ammonium sulfate, 5 g of urea, 0.5 g of monopotassium dihydrogenphosphate, 0.5 g of dipotassium monohydrogenphosphate, 20.9 g of morpholinopropanesulfonic acid, 0.25 g of magnesium sulfate heptahydrate, 10 mg of calcium chloride dihydrate, 10 mg of manganese sulfate monohydrate, 10 mg of ferrous sulfate heptahydrate, 1 mg of zinc sulfate heptahydrate, 0.2 mg of copper sulfate, and 0.2 mg of biotin to 1 liter of water, and adjusting its pH to 6.5) containing 110 mmol/l glucose or 200 mmol/l ammonium acetate, and cultured in an Erlenmeyer flask at 30°C until the absorbance at 660 nm reached 1.0. After the cells were collected by centrifuging at 4°C and 5,000 rpm for 10 minutes, total RNA was prepared from the resulting cells according to the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)). To avoid contamination with DNA, the RNA was treated with DNase I (manufactured by Takara Shuzo) at 37°C for 30 minutes and then further purified using Qiagen RNeasy Mini Kit (manufactured by QIAGEN) according to the manufacturer's instructions. To 30 µg of the resulting total RNA, 0.6 µl of rabbit globin mRNA (50 ng/µl, manufactured by Life Technologies) and 1 µl of a random 6 mer primer (500 ng/µl, manufactured by Takara Shuzo) were added for denaturing at 65°C for 10 minutes, followed by quenching on ice. To the resulting solution, 6 µl of a buffer attached to Superscript II (manufactured by Life Technologies), 3 µl of 0.1 mol/l DTT, 1.5 µl of dNTPs (25 mmol/l dATP, 25 mmol/l dCTP, 25 mmol/l dGTP, 10 mmol/l dTTP), 1.5 µl of Cy5-dUTP or Cy3-dUTP (manufactured by NEN) and 2 µl of Superscript II were added, and the mixture was allowed to stand at 25°C for 10 minutes and then at 42°C for 110 minutes. The RNA extracted from the cells using glucose as the carbon source and the RNA extracted from the cells using ammonium acetate were labeled with Cy5-dUTP and Cy3-dUTP, respectively. After the fluorescence labeling reaction, the RNA was digested by adding 1.5 µl of 1 mol/l sodium hydroxide-20 mmol/l EDTA solution and 3.0 µl of 10% SDS solution, and allowing the mixture to stand at 65°C for 10 minutes. The two cDNA solutions after the labeling were mixed and purified using Qiagen PCR Purification Kit (manufactured by QIAGEN) according to the manufacturer's instructions to give a volume of 10 µl.


(3) Hybridization 

[0433] UltraHyb (110 µl) (manufactured by Ambion) and the fluorescence-labeled cDNA solution (10 µl) were mixed and subjected to hybridization and the subsequent washing of slide glass using GeneTAC Hybridization Station (manufactured by Genomic Solutions) according to the manufacturer's instructions. The hybridization was carried out at 50°C, and the washing was carried out at 25°C.

(4) Fluorescence analysis 

[0434] The fluorescence amount of each DNA array having the fluorescent cDNA hybridized therewith was measured using ScanArray 4000 (manufactured by GSI Lumonics).

[0435] Table 5 shows the Cy3 and Cy5 signal intensities of the genes, corrected on the basis of the data of the rabbit globin used as the internal standard, and the Cy3/Cy5 ratios.

Table 5

SEQ ID NO    Cy3 intensity    Cy5 intensity    Cy3/Cy5
207          5248             3240             1.62
3433         2239             2694             0.83
281          2370             2595             0.91
3435         2566             2515             1.02
3439         5597             6944             0.81
765          6134             4943             1.24
3445         1169             1284             0.91
1226         1301             1493             0.87
1229         1168             1131             1.03
3448         1187             1594             0.74
3451         2845             3859             0.74
3453         3498             1705             2.05
3455         1491             1144             1.30
1743         1972             1841             1.07
3470         4752             3764             1.26
2132         1173             1085             1.08
3476         1847             1420             1.30
3477         1284             1164             1.10
3485         4539             8014             0.57
3488         34289            1398             24.52
3489         43645            1497             29.16
3494         3199             2503             1.28
3496         3428             2364             1.45
3497         3848             3358             1.15


[0436] The ORF function data estimated by using software were searched for SEQ ID NOS:3488 and 3489, which showed remarkably strong Cy3 signals. As a result, it was found that SEQ ID NOS:3488 and 3489 are a malate synthase gene and an isocitrate lyase gene, respectively. It is known that these genes are transcriptionally induced by acetic acid in Corynebacterium glutamicum (Archives of Microbiology, 168: 262-269 (1997)).

[0437] As described above, a gene whose expression fluctuates could be discovered by synthesizing appropriate oligo DNA primers based on the ORF nucleotide sequence information deduced using software from the full genomic nucleotide sequence information of Corynebacterium glutamicum ATCC 13032, amplifying the nucleotide sequences of the genes by PCR using the genome DNA of Corynebacterium glutamicum as a template, and thus producing and using a DNA microarray.

[0438] This Example shows that the expression amounts of the 24 genes can be analyzed using a DNA microarray. On the other hand, the present DNA microarray techniques make it possible to prepare DNA microarrays having thereon several thousand gene probes at once. Accordingly, it is also possible to prepare DNA microarrays having thereon all of the ORF gene probes deduced from the full genomic nucleotide sequence of Corynebacterium glutamicum ATCC 13032 determined by the present invention, and to analyze the expression profile at the total gene level of Corynebacterium glutamicum using these arrays.

Example 5 

Homology search using Corynebacterium glutamicum genome sequence

(1) Search of adenosine deaminase

[0439] The amino acid sequence (ADD_ECOLI) of Escherichia coli adenosine deaminase was obtained from Swiss-Prot Database as the amino acid sequence of a protein whose function had been confirmed as adenosine deaminase (EC 3.5.4.4). By using the full length of this amino acid sequence as a query, a homology search was carried out on a nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the amino acid sequences in the ORF region deduced from the genome sequence using FASTA program (Proc. Natl. Acad. Sci. USA, 85: 2444-2448 (1988)). A case where the E-value was 1e-10 or less was judged as being significantly homologous. As a result, no sequence significantly homologous with the Escherichia coli adenosine deaminase was found in the nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or the database of the amino acid sequences in the ORF region deduced from the genome sequence. Based on these results, it is assumed that Corynebacterium glutamicum contains no ORF having adenosine deaminase activity and thus has no activity of converting adenosine into inosine.

(2) Search of glycine cleavage enzyme 

[0440] The sequences (GCSP_ECOLI, GCST_ECOLI and GCSH_ECOLI) of glycine decarboxylase, aminomethyl transferase and an aminomethyl group carrier, each of which is a component of Escherichia coli glycine cleavage enzyme, were obtained from Swiss-Prot Database as the amino acid sequences of proteins whose function had been confirmed as glycine cleavage enzyme (EC 2.1.2.10).

[0441] By using these full-length amino acid sequences as queries, a homology search was carried out on a nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF amino acid sequences deduced from the genome sequence using FASTA program. A case where the E-value was 1e-10 or less was judged as being significantly homologous. As a result, no sequence significantly homologous with the glycine decarboxylase, the aminomethyl transferase or the aminomethyl group carrier, each of which is a component of Escherichia coli glycine cleavage enzyme, was found in the nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or the database of the ORF amino acid sequences estimated from the genome sequence. Based on these results, it is assumed that Corynebacterium glutamicum contains no ORF having the activity of glycine decarboxylase, aminomethyl transferase or the aminomethyl group carrier and thus has no activity of the glycine cleavage enzyme.

(3) Search of IMP dehydrogenase

[0442] The amino acid sequence (IMDH_ECOLI) of Escherichia coli IMP dehydrogenase was obtained from Swiss-Prot Database as the amino acid sequence of a protein whose function had been confirmed as IMP dehydrogenase (EC 1.1.1.205). By using the full length of this amino acid sequence as a query, a homology search was carried out on a nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF amino acid sequences predicted from the genome sequence using FASTA program. A case where the E-value was 1e-10 or less was judged as being significantly homologous. As a result, the amino acid sequences encoded by two ORFs, namely, an ORF positioned in the region of the nucleotide sequence No. 615336 to 616853 (or ORF having the nucleotide sequence represented by SEQ ID NO:672) and another ORF positioned in the region of the nucleotide sequence No. 616973 to 618094 (or ORF having the nucleotide sequence represented by SEQ ID NO:674), were significantly homologous with Escherichia coli IMP dehydrogenase. In order to examine in greater detail the similarity of the amino acid sequences encoded by the ORFs with IMP dehydrogenases of other organisms, a search was carried out using the above-described predicted amino acid sequences as queries on the GenBank (http://www.ncbi.nlm.nih.gov/) nr-aa database (amino acid sequence database constructed on the basis of GenBank CDS translation products, PDB database, Swiss-Prot database, PIR database and PRF database by eliminating duplicated registrations) using BLAST program. As a result, both of the two amino acid sequences showed significant homologies with IMP dehydrogenases of other organisms and clearly higher homologies with IMP dehydrogenases than with amino acid sequences of other proteins, and thus, it was assumed that the two ORFs would function as IMP dehydrogenase. Based on these results, it was therefore assumed that Corynebacterium glutamicum has two ORFs having the IMP dehydrogenase activity.
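
The significance criterion used throughout this Example (an E-value of 1e-10 or less) can be sketched in Python as a simple filter over search hits. This is only an illustration: the hit records, the ORF labels and the E-values below are hypothetical and are not results from the specification, and the sketch does not run or parse FASTA itself.

```python
# Threshold stated in the text: hits with E-value <= 1e-10 are judged significant.
E_VALUE_CUTOFF = 1e-10


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return hits whose E-value is at or below the cutoff, best hit first."""
    kept = [hit for hit in hits if hit["e_value"] <= cutoff]
    return sorted(kept, key=lambda hit: hit["e_value"])


# Hypothetical hit list: labels and E-values are invented for illustration only.
example_hits = [
    {"orf": "ORF_A", "e_value": 3e-85},
    {"orf": "ORF_B", "e_value": 2e-40},
    {"orf": "ORF_C", "e_value": 5e-4},
]

if __name__ == "__main__":
    for hit in significant_hits(example_hits):
        print(hit["orf"], hit["e_value"])
```

An empty result from such a filter corresponds to the conclusions drawn in (1) and (2), where no significantly homologous sequence was found.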

Example 6 

Proteome analysis of proteins derived from Corynebacterium glutamicum 

(1) Preparation of proteins derived from Corynebacterium glutamicum ATCC 13032, FERM BP-7134 and FERM BP-158

[0443] Culturing tests of Corynebacterium glutamicum ATCC 13032 (wild type strain), Corynebacterium glutamicum FERM BP-7134 (lysine-producing strain) and Corynebacterium glutamicum FERM BP-158 (lysine-highly producing strain) were carried out in a 5 l jar fermenter according to the method in Example 2(3). The results are shown in Table 6.

Table 6

Strain          L-Lysine yield (g/l)
ATCC 13032      0
FERM BP-7134    45
FERM BP-158     60


[0444] After culturing, cells of each strain were recovered by centrifugation. These cells were washed with Tris-HCl buffer (10 mmol/l Tris-HCl, pH 6.5, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer Mannheim)) three times to give washed cells which could be stored under freezing at -80°C. The freeze-stored cells were thawed before use, and used as washed cells.

[0445] The washed cells described above were suspended in a disruption buffer (10 mmol/l Tris-HCl, pH 7.4, 5 mmol/l magnesium chloride, 50 mg/l RNase, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer Mannheim)), and disrupted with a disruptor (manufactured by Brown) under cooling. To the resulting disruption solution, DNase was added to give a concentration of 50 mg/l, and the solution was allowed to stand on ice for 10 minutes. The solution was centrifuged (5,000 x g, 15 minutes, 4°C) to remove the undisrupted cells as the precipitate, and the supernatant was recovered.

[0446] To the supernatant, urea was added to give a concentration of 9 mol/l, and an equivalent amount of a lysis buffer (9.5 mol/l urea, 2% NP-40, 2% Ampholine, 5% mercaptoethanol, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer Mannheim)) was added thereto, followed by thoroughly stirring at room temperature for dissolving.

[0447] After being dissolved, the solution was centrifuged at 12,000 x g for 15 minutes, and the supernatant was recovered.

[0448] To the supernatant, ammonium sulfate was added to the extent of 80% saturation, followed by thoroughly stirring for dissolving.

[0449] After being dissolved, the solution was centrifuged (16,000 x g, 20 minutes, 4°C), and the precipitate was recovered. This precipitate was dissolved in the lysis buffer again and used in the subsequent procedures as a protein sample. The protein concentration of this sample was determined by the Bradford method for quantifying protein.


(2) Separation of protein by two dimensional electrophoresis 

[0450] The first dimensional electrophoresis was carried out as described below by the isoelectric electrophoresis method.

[0451] A molded dry IPG strip gel (pH 4-7, 13 cm, Immobiline DryStrips; manufactured by Amersham Pharmacia Biotech) was set in an electrophoretic apparatus (Multiphor II or IPGphor; manufactured by Amersham Pharmacia Biotech) and a swelling solution (8 mol/l urea, 0.5% Triton X-100, 0.6% dithiothreitol, 0.5% Ampholine, pH 3-10) was packed therein, and the gel was allowed to stand for swelling for 12 to 16 hours.

[0452] The protein sample prepared above was dissolved in a sample solution (9 mol/l urea, 2% CHAPS, 1% dithiothreitol, 2% Ampholine, pH 3-10), and then about 100 to 500 µg (in terms of protein) portions thereof were taken and added to the swollen IPG strip gel.

[0453] The electrophoresis was carried out in the 4 steps defined below while controlling the temperature at 20°C:

step 1: 1 hour under a gradient mode of 0 to 500 V;
step 2: 1 hour under a gradient mode of 500 to 1,000 V;
step 3: 4 hours under a gradient mode of 1,000 to 8,000 V; and
step 4: 1 hour at a constant voltage of 8,000 V.
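
The four-step voltage program above can be summarized as a total number of volt-hours, a figure commonly used to compare isoelectric focusing runs. The following Python sketch computes it under the assumption, not stated in the specification, that each gradient step ramps linearly, so that its contribution is the mean of the start and end voltages multiplied by the duration. The names `IEF_PROGRAM` and `volt_hours` are illustrative.

```python
# Each step is (start voltage in V, end voltage in V, duration in hours),
# taken from steps 1 to 4 of the focusing program.
IEF_PROGRAM = [
    (0, 500, 1),
    (500, 1000, 1),
    (1000, 8000, 4),
    (8000, 8000, 1),
]


def volt_hours(program):
    """Total volt-hours, assuming a linear ramp within each step."""
    return sum((start + end) / 2 * hours for start, end, hours in program)


if __name__ == "__main__":
    # 250 + 750 + 18,000 + 8,000 volt-hours
    print(volt_hours(IEF_PROGRAM))
```

Under the linear-ramp assumption the program amounts to 27,000 Vh, of which the third step contributes two thirds.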

[0454] After the isoelectric electrophoresis, the IPG strip gel was removed from the holder and soaked in an equilibration buffer A (50 mmol/l Tris-HCl, pH 6.8, 30% glycerol, 1% SDS, 0.25% dithiothreitol) for 15 minutes and another equilibration buffer B (50 mmol/l Tris-HCl, pH 6.8, 6 mol/l urea, 30% glycerol, 1% SDS, 0.45% iodoacetamide) for 15 minutes to sufficiently equilibrate the gel.

[0455] After the equilibration, the IPG strip gel was lightly rinsed in an SDS electrophoresis buffer (1.4% glycine, 0.1% SDS, 0.3% Tris-HCl, pH 8.5), and the second dimensional electrophoresis depending on molecular weight was carried out as described below to separate the proteins.

[0456] Specifically, the above IPG strip gel was closely placed on a 14% polyacrylamide slab gel (14% polyacrylamide, 0.37% bisacrylamide, 37.5 mmol/l Tris-HCl, pH 8.8, 0.1% SDS, 0.1% TEMED, 0.1% ammonium persulfate) and subjected to electrophoresis under a constant current of 30 mA at 20°C for 3 hours to separate the proteins.

(3) Detection of protein spot 

[0457] Coomassie staining was performed by the method of Gorg et al. (Electrophoresis, 9: 531-546 (1988)) for the slab gel after the second dimensional electrophoresis. Specifically, the slab gel was stained under shaking at 25°C for about 3 hours, the excessive coloration was removed with a decoloring solution, and the gel was thoroughly washed with distilled water.

[0458] The results are shown in Fig. 2. The proteins derived from the ATCC 13032 strain (Fig. 2A), FERM BP-7134 strain (Fig. 2B) and FERM BP-158 strain (Fig. 2C) could be separated and detected as spots.

(4) In-gel digestion of detected protein spot 

[0459] The detected spots were each cut out from the gel and transferred into a siliconized tube, and 400 µl of 100 mmol/l ammonium bicarbonate : acetonitrile solution (1:1, v/v) was added thereto, followed by shaking overnight and freeze-drying as such. To the dried gel, 10 µl of a lysylendopeptidase (LysC) solution (manufactured by WAKO, prepared with 0.1% SDS-containing 50 mmol/l ammonium bicarbonate to give a concentration of 100 ng/µl) was added and the gel was allowed to stand for swelling at 0°C for 45 minutes, and then allowed to stand at 37°C for 16 hours. After removing the LysC solution, 20 µl of an extracting solution (a mixture of 60% acetonitrile and 5% formic acid) was added, followed by ultrasonication at room temperature for 5 minutes to disrupt the gel. After the disruption, the extract was recovered by centrifugation (12,000 rpm, 5 minutes, room temperature). This operation was repeated twice to recover the whole extract. The recovered extract was concentrated by centrifugation in vacuo to halve the liquid volume. To the concentrate, 20 µl of 0.1% trifluoroacetic acid was added, followed by thoroughly stirring, and the mixture was subjected to desalting using ZipTip (manufactured by Millipore). The protein adsorbed on the carriers of ZipTip was eluted with 5 µl of α-cyano-4-hydroxycinnamic acid for use as a sample solution for analysis.

(5) Mass spectrometry and amino acid sequence analysis of protein spot with matrix assisted laser desorption ionization 
time of flight mass spectrometer (MALDI-TOFMS) 

[0460] The sample solution for analysis was mixed in the equivalent amount with a solution of a peptide mixture for mass calibration (300 nmol/l Angiotensin II, 300 nmol/l Neurotensin, 150 nmol/l ACTH clip 18-39, 2.3 µmol/l bovine insulin B chain), and 1 µl of the obtained solution was spotted on a stainless probe and crystallized by spontaneous drying.

[0461] As measurement instruments, REFLEX MALDI-TOF mass spectrometer (manufactured by Bruker) and an N2 laser (337 nm) were used in combination.

[0462] The analysis by PMF (peptide-mass fingerprinting) was carried out using integrated spectra data obtained by measuring 30 times at an accelerating voltage of 19.0 kV and a detector voltage of 1.50 kV under reflector mode conditions. Mass calibration was carried out by the internal standard method.

[0463] The PSD (post-source decay) analysis was carried out using integrated spectra obtained by successively altering the reflection voltage and the detector voltage at an accelerating voltage of 27.5 kV.

[0464] The masses and amino acid sequences of the peptide fragments derived from the protein spot after digestion were thus determined.

(6) Identification of protein spot 


[0465] From the amino acid sequence information of the digested peptide fragments derived from the protein spot obtained in the above (5), ORFs corresponding to the protein were searched on the genome sequence database of Corynebacterium glutamicum ATCC 13032 as constructed in Example 1 to identify the protein.

[0466] The identification of the protein was carried out using MS-Fit program and MS-Tag program of Intranet Protein Prospector.
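
The principle behind peptide-mass fingerprinting of a LysC digest can be sketched in Python as follows. This is a simplified stand-in for what programs such as MS-Fit do, not their actual algorithm: it cleaves a sequence after every lysine, ignores missed cleavages and chemical modifications, and compares singly protonated monoisotopic masses with observed peaks. The function names, the tolerance and the example sequence are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # added for the singly charged [M+H]+ ion


def lysc_digest(sequence):
    """Cleave C-terminal to every lysine (no missed cleavages)."""
    peptides, start = [], 0
    for index, residue in enumerate(sequence):
        if residue == "K":
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER + PROTON


def match_peaks(observed, sequence, tol=0.2):
    """Pair observed masses with theoretical LysC peptides within tol Da."""
    theoretical = {p: mh_plus(p) for p in lysc_digest(sequence)}
    return [
        (mass, peptide)
        for mass in observed
        for peptide, expected in theoretical.items()
        if abs(mass - expected) <= tol
    ]


if __name__ == "__main__":
    hypothetical_orf = "MKTAYIAKQR"  # invented sequence, for illustration only
    for peptide in lysc_digest(hypothetical_orf):
        print(peptide, round(mh_plus(peptide), 4))
```

In a real search, each ORF in the genome database would be digested this way and ranked by the number of observed peaks it explains.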

(a) Search and identification of gene encoding high-expression protein

[0467] Among the proteins derived from Corynebacterium glutamicum ATCC 13032 showing high expression amounts in CBB-staining shown in Fig. 2A, the proteins corresponding to Spots-1, 2, 3, 4 and 5 were identified by the above method.

[0468] As a result, it was found that Spot-1 corresponded to enolase, which was a protein having the amino acid sequence of SEQ ID NO:4585; Spot-2 corresponded to phosphoglycerate kinase, which was a protein having the amino acid sequence of SEQ ID NO:5254; Spot-3 corresponded to glyceraldehyde-3-phosphate dehydrogenase, which was a protein having the amino acid sequence represented by SEQ ID NO:5255; Spot-4 corresponded to fructose bisphosphate aldolase, which was a protein having the amino acid sequence represented by SEQ ID NO:6543; and Spot-5 corresponded to triose phosphate isomerase, which was a protein having the amino acid sequence represented by SEQ ID NO:6252.

[0469] These genes, represented by SEQ ID NOS:1085, 1754, 1775, 3043 and 1752, which encode the known proteins corresponding to Spots-1, 2, 3, 4 and 5, respectively, are important in the central metabolic pathway for maintaining the life of the microorganism. Particularly, it is suggested that the genes of Spots-2, 3 and 5 form an operon and a high-expression promoter is encoded in the upstream thereof (J. of Bacteriol., 174: 6067-6086 (1992)).

[0470] Also, the protein corresponding to Spot-9 in Fig. 2 was identified in the same manner as described above, and it was found that Spot-9 was an elongation factor Tu, which was a protein having the amino acid sequence represented by SEQ ID NO:6937, and that the protein was encoded by DNA having the nucleotide sequence represented by SEQ ID NO:3437.

[0471] Based on these results, the proteins having high expression levels were identified by proteome analysis using the genome sequence database of Corynebacterium glutamicum constructed in Example 1. Thus, the nucleotide sequences of the genes encoding the proteins and the nucleotide sequences upstream thereof could be searched simultaneously. Accordingly, it is shown that nucleotide sequences having a function as a high-expression promoter can be efficiently selected.

(b) Search and identification of modified protein

[0472] Among the proteins derived from Corynebacterium glutamicum FERM BP-7134 shown in Fig. 2B, Spots-6, 7 and 8 were identified by the above method. As a result, these three spots all corresponded to catalase, which was a protein having the amino acid sequence represented by SEQ ID NO:3786.

[0473] Accordingly, Spots-6, 7 and 8, detected as spots differing in isoelectric mobility, were all products derived from a catalase gene having the nucleotide sequence represented by SEQ ID NO:285. It is thus shown that the catalase derived from Corynebacterium glutamicum FERM BP-7134 was modified after the translation.

[0474] Based on these results, it is confirmed that various modified proteins can be efficiently searched by proteome analysis using the genome sequence database of Corynebacterium glutamicum constructed in Example 1.


(c) Search and identification of expressed protein effective in lysine production 

[0475] It was found that in Fig. 2A (ATCC 13032: wild type strain), Fig. 2B (FERM BP-7134: lysine-producing strain) and Fig. 2C (FERM BP-158: lysine-highly producing strain), the catalase corresponding to Spot-8 and the elongation factor Tu corresponding to Spot-9 as identified above showed higher expression levels with an increase in the lysine productivity.

[0476] Based on these results, it was found that promising mutated proteins can be efficiently searched and identified, in breeding aimed at strengthening the productivity of a target product, by the proteome analysis using the genome sequence database of Corynebacterium glutamicum constructed in Example 1.

[0477] Moreover, useful mutation points of useful mutants can be easily specified by searching the nucleotide sequences (nucleotide sequences of promoter, ORF, or the like) relating to the identified proteins using the above database and using primers designed on the basis of the sequences. Once the mutation points are specified, industrially useful mutants which have the useful mutations or other useful mutations derived therefrom can be easily bred.

[0478] While the invention has been described in detail and with reference to specific embodiments thereof, it will be apparent to one of skill in the art that various changes and modifications can be made therein without departing from the spirit and scope thereof. All references cited herein are incorporated in their entirety.



Claims

1. A method for at least one of the following:

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium,
(B) measuring an expression amount of a gene derived from a coryneform bacterium,
(C) analyzing an expression profile of a gene derived from a coryneform bacterium,
(D) analyzing expression patterns of genes derived from a coryneform bacterium, or
(E) identifying a gene homologous to a gene derived from a coryneform bacterium,

said method comprising:

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of the first or second polynucleotides,
(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a coryneform bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a labeled polynucleotide to be examined, under hybridization conditions,
(c) detecting any hybridization, and
(d) analyzing the result of the hybridization.

2. The method according to claim 1, wherein the coryneform bacterium is a microorganism belonging to the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

3. The method according to claim 2, wherein the microorganism belonging to the genus Corynebacterium is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

4. The method according to claim 1, wherein the polynucleotide derived from a coryneform bacterium, the polynucleotide derived from a mutant of the coryneform bacterium or the polynucleotide to be examined is a gene relating to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof.

5. The method according to claim 1, wherein the polynucleotide to be examined is derived from Escherichia coli.

6. A polynucleotide array, comprising:

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200 continuous bases of the first or second polynucleotides, and
a solid support adhered thereto.

7. A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having a homology of at least 80% with the polynucleotide.

8. A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or a polynucleotide which hybridizes with the polynucleotide under stringent conditions.

9. A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions.

10. A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the nucleotide
sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide sequence
represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide.

11. A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of any
one of claims 7 to 10, or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide
comprising 10 to 200 continuous bases.
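
By way of non-limiting illustration only, the fragment and complementary sequence recited in claim 11 can be sketched as follows. The helper names and the toy sequence are hypothetical and are not taken from the sequence listing; the sketch assumes an unambiguous DNA alphabet (A, C, G, T).

```python
# Hypothetical helpers; the toy sequence is made up and is not SEQ ID data.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def continuous_fragment(seq, start, length):
    """Return `length` continuous bases of `seq` starting at 0-based `start`."""
    if not 10 <= length <= 200:
        raise ValueError("length must be between 10 and 200 bases")
    if start < 0 or start + length > len(seq):
        raise ValueError("fragment falls outside the sequence")
    return seq[start:start + length]


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


toy = "ATGGCTAAACGTTTGACC"
fragment = continuous_fragment(toy, 3, 12)   # 12 continuous bases
probe = reverse_complement(fragment)         # complementary sequence
print(fragment, probe)
```

Either the fragment or its complement could serve as an oligonucleotide in this toy setting; the length check simply mirrors the 10 to 200 base range of the claim.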

12. A recombinant DNA comprising the polynucleotide of any one of claims 8 to 11.

13. A transformant comprising the polynucleotide of any one of claims 8 to 11 or the recombinant DNA of claim 12. 

14. A method for producing a polypeptide, comprising:

culturing the transformant of claim 13 in a medium to produce and accumulate a polypeptide encoded by the
polynucleotide of claim 8 or 9 in the medium, and
recovering the polypeptide from the medium.

15. A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof, comprising:

culturing the transformant of claim 13 in a medium to produce and accumulate at least one of an amino acid,
a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and
recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid,
and analogues thereof from the medium.

16. A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:2 to
3431.

17. A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931.

18. The polypeptide according to claim 16 or 17, wherein at least one amino acid is deleted, replaced, inserted or
added, said polypeptides having an activity which is substantially the same as that of the polypeptide without said
at least one amino acid deletion, replacement, insertion or addition.

19. A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid sequence
of the polypeptide of claim 16 or 17, and having an activity which is substantially the same as that of the polypeptide.

20. An antibody which recognizes the polypeptide of any one of claims 16 to 19.

21. A polypeptide array, comprising:

at least one polypeptide or partial fragment polypeptide selected from the polypeptides of claims 16 to 19 and
partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

22. A polypeptide array, comprising:

at least one antibody which recognizes a polypeptide or partial fragment polypeptide selected from the polypep-
tides of claims 16 to 19 and partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

23. A system based on a computer for identifying a target sequence or a target structure motif derived from a coryne-
form bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif information;
(ii) a data storage device for at least temporarily storing the input information;
(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:
1 to 3501 with the target sequence or target structure motif information, recorded by the data storage device
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information; and
(iv) an output device that shows a screening or analyzing result obtained by the comparator.

24. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryne-
form bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target se-
quence information or target structure motif information into a user input device;
(ii) at least temporarily storing said information;
(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with
the target sequence or target structure motif information; and
(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information.
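
By way of non-limiting illustration only, the storing, comparing and screening steps of claims 23 and 24 can be sketched as follows. This is a minimal sketch under stated assumptions: stored sequences are held in an in-memory dictionary, "coincident" is read as an exact match and "analogous" as a match with at most a chosen number of mismatches. The function names, the mismatch threshold and the toy sequences are hypothetical and do not come from the sequence listing.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def screen(stored, target, max_mismatches=1):
    """Scan every stored sequence for windows close to `target`.

    stored: dict mapping a sequence identifier to its sequence (assumed layout).
    Returns a list of (identifier, 0-based position, mismatch count).
    """
    hits = []
    n = len(target)
    for seq_id, seq in stored.items():
        for pos in range(len(seq) - n + 1):
            distance = hamming(seq[pos:pos + n], target)
            if distance <= max_mismatches:
                hits.append((seq_id, pos, distance))
    return hits


# Toy data standing in for stored nucleotide sequence information.
stored = {"toy_1": "ATGGCTAAACGT", "toy_2": "TTGACCGGTAAA"}
hits = screen(stored, "GCTAAA")
print(hits)
```

A hit with zero mismatches corresponds to a coincident sequence in this sketch and a hit with one mismatch to an analogous one. A practical system would more likely rely on an alignment-based search tool than on this exhaustive window scan.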

25. A system based on a computer for identifying a target sequence or a target structure motif derived from a coryne-
form bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, and target sequence or target structure motif information;
(ii) a data storage device for at least temporarily storing the input information;
(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target sequence or target structure motif information, recorded by the data storage
device for screening and analyzing amino acid sequence information which is coincident with or analogous to
the target sequence or target structure motif information; and
(iv) an output device that shows a screening or analyzing result obtained by the comparator.

26. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryne-
form bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, and target
sequence information or target structure motif information into a user input device;
(ii) at least temporarily storing said information;
(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target sequence or target structure motif information; and
(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the
target sequence or target structure motif information.

27. A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having a
target nucleotide sequence derived from a coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide
sequence information;
(ii) a data storage device for at least temporarily storing the input information;
(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:
2 to 3501 with the target nucleotide sequence information for determining a function of a polypeptide encoded
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the poly-
nucleotide having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and
(iv) an output device that shows a function obtained by the comparator.

28. A method based on a computer for determining a function of a polypeptide encoded by
a polynucleotide having a target nucleotide sequence derived from a coryneform bacterium, comprising the fol-
lowing:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function in-
formation of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information;
(ii) at least temporarily storing said information;
(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with
the target nucleotide sequence information; and
(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected
from SEQ ID NOS:2 to 3501.

29. A system based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence infor-
mation;
(ii) a data storing device for at least temporarily storing the input information;
(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and
(iv) an output device that shows a function obtained by the comparator.

30. A method based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function
information based on the amino acid sequence, and target amino acid sequence information;
(ii) at least temporarily storing said information;
(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target amino acid sequence information; and
(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to
7001.
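
By way of non-limiting illustration only, the function determination of claims 27 to 30 can be sketched as a lookup of the stored sequence most similar to the target, followed by a report of the function information recorded for that sequence. The record layout, the similarity threshold, the function labels and the toy sequences are all assumptions made for this sketch. `difflib.SequenceMatcher` from the Python standard library is used here as a simple stand-in for a real homology search.

```python
from difflib import SequenceMatcher


def infer_function(records, target, min_similarity=0.8):
    """Return (identifier, function, score) for the best match, or None.

    records: dict mapping an identifier to (sequence, function label);
    this layout is an assumption made for the sketch.
    """
    best_id, best_score = None, 0.0
    for rec_id, (seq, _function) in records.items():
        score = SequenceMatcher(None, seq, target, autojunk=False).ratio()
        if score > best_score:
            best_id, best_score = rec_id, score
    if best_id is None or best_score < min_similarity:
        return None  # nothing coincident with or analogous to the target
    return best_id, records[best_id][1], best_score


# Toy records; sequences and function labels are invented for illustration.
records = {
    "toy_A": ("MKVLAAGIVGLT", "hypothetical dehydrogenase"),
    "toy_B": ("MSTPQWERHHKL", "hypothetical transporter"),
}
exact = infer_function(records, "MKVLAAGIVGLT")
close = infer_function(records, "MKVLAAGIAGLT")  # one residue differs
none_found = infer_function(records, "WWWWWWWW")
print(exact, close, none_found)
```

The threshold decides how distant a sequence may be while still being treated as analogous; the value 0.8 is arbitrary and is not taken from the claims.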

31. The system according to any one of claims 23, 25, 27 and 29, wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

32. The method according to any one of claims 24, 26, 28 and 30, wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

33. The system according to claim 31, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacteri-
um melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

34. The method according to claim 32, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacteri-
um melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

35. A recording medium or storage device which is readable by a computer in which at least one nucleotide sequence
information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide sequence is
recorded, and is usable in the system of claim 23 or 27 or the method of claim 24 or 28.

36. A recording medium or storage device which is readable by a computer in which at least one amino acid sequence
information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid sequence
is recorded, and is usable in the system of claim 25 or 29 or the method of claim 26 or 30.

37. The recording medium or storage device according to claim 35 or 36, which is a computer readable recording
medium selected from the group consisting of a floppy disc, a hard disc, a magnetic tape, a random access memory
(RAM), a read only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM
and DVD-RW.

38. A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the Val
residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform bacterium
is replaced with an amino acid residue other than a Val residue.

39. A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino acid
sequence as represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue.

40. The polypeptide according to claim 38 or 39, wherein the Val residue at the 59th position is replaced with an Ala
residue.

41. A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro residue
at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform bacterium is
replaced with an amino acid residue other than a Pro residue.

42. A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino acid
sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue.

43. The polypeptide according to claim 41 or 42, wherein the Pro residue at the 458th position is replaced with a Ser
residue.
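
By way of non-limiting illustration only, the single-residue replacements recited in claims 38 to 43 can be expressed as a position-checked substitution on a one-letter amino acid string. The helper name and the toy sequence are hypothetical; the toy sequence is constructed only so that a Val residue sits at position 59 and it is not SEQ ID NO:6952.

```python
def substitute(seq, position, expected, replacement):
    """Replace the residue at 1-based `position`, checking the original residue."""
    index = position - 1
    if not 0 <= index < len(seq):
        raise ValueError("position lies outside the sequence")
    if seq[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {seq[index]}"
        )
    return seq[:index] + replacement + seq[index + 1:]


# Toy sequence: Met, 57 Ala residues, then Val at position 59, then 5 Gly.
toy = "M" + "A" * 57 + "V" + "G" * 5
mutant = substitute(toy, 59, "V", "A")  # Val59 replaced with Ala
print(mutant[58])
```

The same helper would describe the Pro to Ser replacement at position 458 when called with `position=458, expected="P", replacement="S"` on a suitably long sequence.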

44. The polypeptide according to any one of claims 38 to 43, which is derived from Corynebacterium glutamicum. 

45. A DNA encoding the polypeptide of any one of claims 38 to 44.

46. A recombinant DNA comprising the DNA of claim 45.

47. A transformant comprising the recombinant DNA of claim 46. 

48. A transformant comprising in its chromosome the DNA of claim 45.

49. The transformant according to claim 47 or 48, which is derived from a coryneform bacterium.

50. The transformant according to claim 49, which is derived from Corynebacterium glutamicum.

51. A method for producing L-lysine, comprising:

culturing the transformant of any one of claims 47 to 50 in a medium to produce and accumulate L-lysine in 
the medium, and 

recovering the L-lysine from the culture.

52. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:1 to 3431, comprising the following:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacte-
rium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;
(ii) identifying a mutation point present in the production strain based on a result obtained by (i);
(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point, or deleting
the mutation point from a coryneform bacterium having the mutation point; and
(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).
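
By way of non-limiting illustration only, the comparison and identification steps (i) and (ii) of claim 52 can be sketched for the simplest case, in which the reference gene and the production-strain gene have the same length and differ only by base substitutions. The function name and both toy sequences are hypothetical; insertions and deletions would require a proper alignment, which this sketch does not attempt.

```python
def mutation_points(reference, production):
    """List substitutions as (1-based position, reference base, production base).

    Assumes both sequences are already aligned and of equal length.
    """
    if len(reference) != len(production):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (i + 1, ref_base, prod_base)
        for i, (ref_base, prod_base) in enumerate(zip(reference, production))
        if ref_base != prod_base
    ]


# Toy genes: the second codon changes from GTT (Val) to GCT (Ala).
reference = "ATGGTTGCA"
production = "ATGGCTGCA"
points = mutation_points(reference, production)
print(points)
```

Each reported point is a candidate for the introduction or deletion of step (iii), after which productivity would be examined as in step (iv).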

53. The method according to claim 52, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

54. The method according to claim 52, wherein the mutation point is a mutation point relating to a useful mutation 
which improves or stabilizes the productivity. 

55. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:1 to 3431, comprising:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacte-
rium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;
(ii) identifying a mutation point present in the production strain based on a result obtained by (i);
(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and
(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).

56. The method according to claim 55, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

57. The method according to claim 55, wherein the mutation point is a mutation point which decreases or destabilizes 
the productivity. 

58. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:2 to 3431, comprising the following:

(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide se-
quence information represented by SEQ ID NOS:2 to 3431;
(ii) classifying the isozyme identified in (i) into an isozyme having the same activity;
(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and
(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform bac-
terium which has been transformed with the gene obtained in (iii).

59. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:2 to 3431, comprising the following:

(i) arranging a function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431;
(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission
pathway;
(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a co-
ryneform bacterium;
(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and
(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv).

60. A coryneform bacterium, bred by the method of any one of claims 52 to 59.

61. The coryneform bacterium according to claim 60, which is a microorganism belonging to the genus Corynebacte-
rium, the genus Brevibacterium, or the genus Microbacterium.

62. The coryneform bacterium according to claim 61, wherein the microorganism belonging to the genus Corynebac-
terium is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lil-
ium, Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammonia-
genes.

63. A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid and an analogue thereof, comprising:

culturing a coryneform bacterium of any one of claims 60 to 62 in a medium to produce and accumulate at
least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof;
recovering the compound from the culture.

64. The method according to claim 63, wherein the compound is L-lysine.

65. A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the following: 

(i) preparing
a protein derived from a bacterium of a production strain of a coryneform bacterium which has been sub-
jected to mutation breeding by a fermentation process so as to produce at least one compound selected
from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, and
a protein derived from a bacterium of a parent strain of the production strain;
(ii) separating the proteins prepared in (i) by two dimensional electrophoresis;
(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the
production strain with that derived from the parent strain;
(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase
to extract peptide fragments;
(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and
(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequence represented by SEQ
ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.
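
By way of non-limiting illustration only, the comparison of step (vi) of claim 65 can be sketched as a search for stored protein sequences that contain every sequenced peptide fragment. The dictionary layout, the function name and the toy sequences are assumptions made for this sketch and are not taken from the sequence listing; exact substring matching is used for simplicity, whereas real peptide data would normally need tolerance for sequencing ambiguity.

```python
def identify_proteins(peptides, proteins):
    """Return identifiers of proteins that contain all peptide fragments.

    proteins: dict mapping an identifier to an amino acid sequence
    (assumed layout for this sketch).
    """
    return [
        protein_id
        for protein_id, sequence in proteins.items()
        if all(peptide in sequence for peptide in peptides)
    ]


# Toy protein set standing in for the stored amino acid sequences.
proteins = {
    "toy_P1": "MKVLAAGIVGLTERK",
    "toy_P2": "MSTPQWERHHKLAAG",
}
matches = identify_proteins(["VLAAG", "GLTER"], proteins)
print(matches)
```

Requiring every fragment to match keeps the toy result unambiguous; a looser rule, such as ranking proteins by the number of matching fragments, would be an equally reasonable design.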

66. The method according to claim 65, wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

67. The method according to claim 66, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacteri-
um melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

68. A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).